[slurm-users] New Bright Cluster Slurm issue for AD users
John Hearns
hearnsj at googlemail.com
Wed Feb 13 20:07:13 UTC 2019
Matthew, that deserves an explanation. Bright Computing Proof of Concept
causes nightmares?
That is a pretty strong assertion. Please give more details.
On Wed, 13 Feb 2019 at 16:01, Matthew BETTINGER <
matthew.bettinger at external.total.com> wrote:
> One of the main guys, Panos, left Bright, so I have no answer to your
> specific question, but I hope you can get some support with it. We dumped
> our BC PoC; the sysadmin who worked on the PoC still has nightmares.
>
> On 2/13/19, 6:54 AM, "slurm-users on behalf of John Hearns" <
> slurm-users-bounces at lists.schedmd.com on behalf of hearnsj at googlemail.com>
> wrote:
>
> Yugendra, the Bright support guys are excellent.
> Slurm is their default choice. I would ask again. Yes, Slurm is
> technically out of scope for them, but they should help a bit.
>
>
> By the way, I think your problem is that you have configured
> authentication using AD on your head node,
> BUT you have not configured it on the compute node images. You probably
> have to prepare a new compute node image, then push that out to the compute
> nodes.
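
A minimal sketch for testing that theory, assuming the "Invalid user id" error means the compute node cannot map the numeric UID to an account through its own name service (local passwd, SSSD, or whatever the image uses). The helper name uid_resolves is hypothetical; 10952 is the UID from the error message quoted further down. Run it on the head node and again on cnode001 and compare.

```python
import pwd


def uid_resolves(uid: int) -> bool:
    """Return True if this host can map the numeric UID to a passwd entry."""
    try:
        # getpwuid goes through the host's configured name service (NSS),
        # so AD users only resolve if the AD integration works on this host.
        pwd.getpwuid(uid)
        return True
    except KeyError:
        # No entry for this UID on this host.
        return False


if __name__ == "__main__":
    import socket

    uid = 10952  # UID reported by srun in the original post
    state = "resolves" if uid_resolves(uid) else "does NOT resolve"
    print(f"{socket.gethostname()}: uid {uid} {state}")
```

If the UID resolves on the head node but not on the compute node, that points at the compute node image rather than at Slurm itself.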
>
> On Wed, 13 Feb 2019 at 12:35, Yugendra Guvvala <
> yguvvala at cambridgecomputer.com> wrote:
>
>
> I also reached out to Bright Computing support, and they say Slurm is
> out of scope for them.
>
> Thanks,
> Yugi
>
>
> On Feb 13, 2019, at 7:27 AM, Antony Cleave <antony.cleave at gmail.com>
> wrote:
>
>
>
> Can you ssh to the compute node that the job was trying to run on as
> the AD user in question?
>
>
> I've seen similar issues on AD-integrated systems where some nodes
> boot from a different image that has not yet been joined to the domain.
>
>
> Antony
>
>
> On Wed, 13 Feb 2019 at 04:58, Yugendra Guvvala <
> yguvvala at cambridgecomputer.com> wrote:
>
>
> Hi,
>
>
> We are bringing a new cluster online. We installed Slurm through
> Bright Cluster Manager; however, we are running into an issue.
>
>
> We are able to run jobs as the root user and as users created using Bright
> Cluster Manager (cmsh commands). However, we use AD authentication for all
> our users, and when we try to submit jobs to Slurm as AD users we get the
> following error message.
>
>
>
>
> srun: fatal: Invalid user id: 10952
> srun: fatal: Invalid user id: 10952
> srun: error: cnode001: task 0: Exited with exit code 1
>
>
>
> Attached is the slurm.conf file for reference. Please let us know if
> you have any insight into this.
>
> Thanks,
> Yugi
>
>
> Yugendra Guvvala | HPC Technologist | Cambridge Computer | "Artists
> in Data Storage"
> Direct: 781-250-3273 | Cell: 806-773-4464 |
> yguvvala at cambridgecomputer.com | www.cambridgecomputer.com <
> http://www.cambridgecomputer.com>
>