We use Active Directory and NFSv4, and I think we have some instructions for setting it up on CentOS 7. It was quite involved and requires the directory service to return UID and GID information, so we have populated the RFC2307 fields in AD. Munge needs consistent UIDs and GIDs on every node in order to work.
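A quick way to confirm that a node actually gets UID and GID values from the directory is to ask the local name service. This is only a minimal sketch using Python's standard `pwd` module; the user names in the loop are placeholders, so substitute one of your directory accounts.

```python
import pwd


def lookup_ids(username):
    """Return (uid, gid) as resolved by the local name service, or None if unknown."""
    try:
        entry = pwd.getpwnam(username)
    except KeyError:
        # The name service (files, SSSD, ...) does not know this user.
        return None
    return entry.pw_uid, entry.pw_gid


# Placeholder names: replace with a real directory account on your site.
for name in ("root", "no-such-user-xyz"):
    print(name, lookup_ids(name))
```

If a directory user comes back as None on a compute node, the node is not resolving that account at all, which is worth fixing before looking at Slurm itself.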

 

We also use AUKS (https://github.com/cea-hpc/auks and https://slurm.schedmd.com/slurm_ug_2012/auks-tutorial.pdf) so that the Kerberos tickets are refreshed on the compute nodes; otherwise jobs must complete within the Kerberos ticket lifetime (24 hours for us).

 

This may be overcomplicated for what you need, but it sounds as if you do not have consistent UIDs across all nodes, which would create problems for munge.
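One way to check for this is to compare what each node resolves for the same user. The following is a minimal sketch, assuming you have collected the output of `getent passwd <user>` from each node by whatever means you prefer; the node names, user name and passwd lines below are made up for illustration.

```python
def parse_passwd_line(line):
    """Return (name, uid, gid) from one passwd-format line."""
    fields = line.strip().split(":")
    return fields[0], int(fields[2]), int(fields[3])


def find_id_mismatches(entries_by_node):
    """Return users that resolve to more than one (uid, gid) pair.

    The result maps user -> {(uid, gid): [nodes reporting that pair]}.
    """
    seen = {}
    for node, lines in entries_by_node.items():
        for line in lines:
            name, uid, gid = parse_passwd_line(line)
            seen.setdefault(name, {}).setdefault((uid, gid), []).append(node)
    return {name: ids for name, ids in seen.items() if len(ids) > 1}


# Hypothetical sample data: node002 disagrees with the other two nodes.
sample = {
    "login1": ["alice:*:10001:10001:Alice:/home/alice:/bin/bash"],
    "node001": ["alice:*:10001:10001:Alice:/home/alice:/bin/bash"],
    "node002": ["alice:*:20417:20417:Alice:/home/alice:/bin/bash"],
}

mismatches = find_id_mismatches(sample)
for user, ids in mismatches.items():
    print(user, ids)
```

Any user listed in the result has different IDs on different nodes, and that is the situation munge will not tolerate.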

 

I’ll let others chip in, but I can probably find the documents we used to set it up.

 

William

 

From: Richard Chang via slurm-users <slurm-users@lists.schedmd.com>
Sent: Sunday, February 4, 2024 5:39 AM
To: slurm-users@schedmd.com
Subject: [slurm-users] SLURM configuration for LDAP users

 

Hi,

I am a little new to this, so please pardon my ignorance.

I have configured Slurm in my cluster and it works fine with local users, but I am not able to get it working with LDAP/SSSD authentication.

User logins using ssh are working fine. An LDAP user can log in to the login, slurmctld and compute nodes, but when they try to submit jobs, slurmctld logs an error about an invalid account or partition for the user.

Someone said we need to add the user manually into the database using the sacctmgr command, but I am not sure we need to do this for each and every LDAP user. Yes, it does work if we add the LDAP user manually using sacctmgr, but I am not convinced that this manual approach is the right way to do it.
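If the per-user sacctmgr step does turn out to be necessary, it can at least be scripted rather than typed by hand. The sketch below only builds the command lines so they can be reviewed before anything is run. The user names and the account name `general` are hypothetical, the account is assumed to exist already, and how you obtain the two user lists (from the directory and from the accounting database) is left open.

```python
def sacctmgr_add_user_cmd(user, account):
    """Build the argument list for adding one user to an existing account.

    -i (--immediate) makes sacctmgr commit without asking for confirmation.
    """
    return ["sacctmgr", "-i", "add", "user", f"name={user}", f"account={account}"]


def commands_for_missing(directory_users, known_users, account):
    """Return one sacctmgr command per directory user not yet known to accounting."""
    missing = sorted(set(directory_users) - set(known_users))
    return [sacctmgr_add_user_cmd(user, account) for user in missing]


# Hypothetical inputs: users found in LDAP and users already in the accounting database.
ldap_users = ["alice", "bob", "carol"]
already_in_accounting = ["bob"]

cmds = commands_for_missing(ldap_users, already_in_accounting, "general")
for cmd in cmds:
    print(" ".join(cmd))
```

Printing the commands first, rather than executing them, keeps the step easy to audit; the same lists could later be passed to `subprocess.run` from a cron job once you trust the output.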

The documentation is not very clear about using LDAP accounts.

I saw a suggestion somewhere on the list to set UsePAM=1 and to copy or symlink the Slurm PAM module under /etc/pam.d, but it didn't work for me.

I saw somewhere else that we need to specify LaunchParameters=enable_nss_slurm in the slurm.conf file and put the slurm keyword in the passwd/group entries in the /etc/nsswitch.conf file. I did both, but that didn't help either.

I am bereft of ideas at present. If anyone has real world experience and can advise, I will be grateful.

Thank you,

Richard