[slurm-users] New Bright Cluster Slurm issue for AD users

John Hearns hearnsj at googlemail.com
Wed Feb 13 12:58:34 UTC 2019


Please have a look at section 6.3 of the Bright Admin Manual.
Have you run updateprovisioners and then rebooted the nodes?


Configuring The Cluster To Authenticate Against An External LDAP Server

The cluster can be configured in different ways to authenticate against an
external LDAP server. For smaller clusters, a configuration where LDAP
clients on all nodes point directly to the external server is recommended.
An easy way to set this up is as follows:

• On the head node:
  – In distributions that are:
    * derived from prior to RHEL 6: the URIs in /etc/ldap.conf, and in the
      image file /cm/images/default-image/etc/ldap.conf are set to point to
      the external LDAP server.
    * derived from the RHEL 6.x series: the file /etc/ldap.conf does not
      exist. The files in which the changes then need to be made are
      /etc/nslcd.conf and /etc/pam_ldap.conf. To implement the changes, the
      nslcd daemon must then be restarted, for example with
      service nslcd restart.
    * derived from RHEL 7.x series: the file /etc/ldap.conf does not exist.
      The files in which the changes then need to be made are
      /etc/nslcd.conf and /etc/openldap/ldap.conf. To implement the changes,
      the nslcd daemon must then be restarted, for example with
      service nslcd restart.
  – The updateprovisioners command (section 5.2.4) is run to update any
    other provisioners.
• Then, to update configurations on the regular nodes so that they are able
  to do LDAP lookups:
  – They can simply be rebooted to pick up the updated configuration, along
    with the new software image.
  – Alternatively, to avoid a reboot, the imageupdate command (section
    5.6.2) can be run to pick up the new software image from a provisioner.
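
Once the nodes have rebooted (or imageupdate has run), one way to check that
a compute node can actually do the lookups is to ask it to resolve the UID
directly, as root on the node. The following is only a minimal sketch, not
something from the Bright manual: it assumes Python 3 is available on the
node, and the function names resolve_uid and check_uids are made up for
illustration. It uses the standard pwd module, which goes through the
system's normal passwd lookup, so a UID that fails here is likely to fail
for other programs on that node as well.

```python
import pwd


def resolve_uid(uid):
    """Return the account name for uid, or None if this node cannot resolve it."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # pwd.getpwuid raises KeyError when no passwd entry is found.
        return None


def check_uids(uids):
    """Print one line per UID and return the list of UIDs that did not resolve."""
    unresolved = []
    for uid in uids:
        name = resolve_uid(uid)
        if name is None:
            print(f"UID {uid}: NOT resolvable on this node")
            unresolved.append(uid)
        else:
            print(f"UID {uid}: {name}")
    return unresolved


if __name__ == "__main__":
    # 0 (root) is a sanity check; 10952 is the UID from the srun error
    # in the original message.
    check_uids([0, 10952])
```

If UID 10952 resolves on the login node but not on cnode001, that would
suggest the compute node image has not picked up the LDAP/AD client
configuration yet.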

On Wed, 13 Feb 2019 at 12:55, Antony Cleave <antony.cleave at gmail.com> wrote:

> Can you ssh in as root and then su to the AD user to make sure that the
> node is integrated correctly?
>
> If you cannot su to an AD user on the node then Slurm will not be able to
> resolve the UID either as they use the same methods.
>
> On Wed, 13 Feb 2019, 12:35 Yugendra Guvvala, <
> yguvvala at cambridgecomputer.com> wrote:
>
>> No, we can’t ssh to compute nodes. This is by design: no one other than
>> root should be able to ssh to compute nodes.
>>
>> I figure that munge is not configured for AD. We have configured our
>> login image for AD, and the Slurm and munge configurations are on the head
>> node. Not sure how to integrate these.
>>
>> Thanks,
>> Yugi
>>
>> On Feb 13, 2019, at 7:27 AM, Antony Cleave <antony.cleave at gmail.com>
>> wrote:
>>
>> Can you ssh to the compute node that the job was trying to run on as the
>> AD user in question?
>>
>> I've seen similar issues on AD-integrated systems where some nodes boot
>> from a different image that has not yet been joined to the domain.
>>
>> Antony
>>
>> On Wed, 13 Feb 2019 at 04:58, Yugendra Guvvala <
>> yguvvala at cambridgecomputer.com> wrote:
>>
>>> Hi,
>>>
>>> We are bringing a new cluster online. We installed Slurm through Bright
>>> Cluster Manager; however, we are running into an issue.
>>>
>>> We are able to run jobs as the root user and as users created using Bright
>>> Cluster Manager (cmsh commands). However, we use AD authentication for all
>>> our users, and when we try to submit jobs to Slurm as AD users we get the
>>> following error message:
>>>
>>>
>>> srun: fatal: Invalid user id: 10952
>>> srun: fatal: Invalid user id: 10952
>>> srun: error: cnode001: task 0: Exited with exit code 1
>>>
>>> Attached is the slurm.conf file for reference. Please let us know if you
>>> have any insight into this.
>>>
>>>
>>>
>>> Thanks,
>>> Yugi
>>>
>>> *Yugendra Guvvala | HPC Technologist ** |** Cambridge Computer ** |** "Artists
>>> in Data Storage" *
>>> *Direct:* 781-250-3273  | *Cell*: 806-773-4464  |
>>> yguvvala at cambridgecomputer.com  | www.cambridgecomputer.com
>>>
>>>
>>> _______________________________________________________________________________________________
>>>
>>>