With GSSAPI enabled the user logs in fine with no further password prompt, but they cannot access any of the NFSv4 disks.  I don’t know the internals of GSSAPI, but I presume it passes an existing Kerberos ticket on to the target system, which is accepted for login; that ticket cannot then be forwarded again to the storage system to authenticate access to the storage.

 

My take is that this is why we have to enable in AD the constrained delegation of the nfs/ SPNs to the storage system’s AD machine account, on each server that needs to access the storage.   I think it is that step which allows the server (e.g. the login node) to take the user’s username & password, get a ticket from AD and pass that to the storage array, and that is what allows access to the storage.  So far as I know the ticket cannot be double-hopped, as it were.
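As a rough illustration only (not our actual scripts; the computer name and SPN are placeholders), classic constrained delegation for one server can be set with the ActiveDirectory PowerShell module along these lines:

```powershell
# Hedged sketch: allow login node LOGIN01 to delegate to the storage
# array's NFS service. LOGIN01 and storage.example.com are placeholders.
Set-ADComputer -Identity LOGIN01 `
  -Add @{'msDS-AllowedToDelegateTo' = 'nfs/storage.example.com'}
```

In practice this is usually done through the "Delegation" tab of the computer object in ADUC; the attribute edit above is just the scripted equivalent.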

 

What AUKS is doing is managing credential renewals and passing the credential to the compute nodes.   It is very well explained here: https://slurm.schedmd.com/slurm_ug_2012/auks-tutorial.pdf.
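At the Slurm end, AUKS hooks in as a SPANK plugin; a minimal sketch of the relevant line in plugstack.conf (the plugin path is a placeholder and options vary by site) would be something like:

```
# Hedged sketch: enable the AUKS SPANK plugin so the user's Kerberos
# credential is pushed to, and renewed on, the compute nodes.
optional  /usr/lib64/slurm/auks.so  default=enabled
```

The AUKS tutorial linked above covers the daemon-side configuration (auksd, renewers, ACLs) that goes with this.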

 

It is quite possible that there are ways to combine GSSAPI and the NFSv4 mounts, but what we have works and the users seem OK with it; they only have to tweak the SSH settings the once in PuTTY or MobaXterm.   I’ve seen some users try to set up ssh keys and we just tell them “it will not work, by design”, and also that, as we may at times handle Personal Data, we like it the way it is.   I can see that it would be more impactful if you had developers with an IDE like VScode who want to edit in the IDE and just click to run their Python on the remote cluster, but we don’t have that.

 

I could PM you more detail of the documented steps that we use, though we now have them spread across a few scripts and they likely aren’t very portable.   We followed Red Hat documents to get NFSv4 working.  We had already been AD-joining almost all of our servers for a while, to ensure UID consistency everywhere.   We moved to NFSv4 a few years back once we figured out how to make it work, and we use sec=krb5i on most mounts and sec=krb5p on a very few (krb5p is known to hit performance more).
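For reference, a Kerberised NFSv4 mount of this kind typically looks something like the following in /etc/fstab (the hostname and export paths here are placeholders, not our real ones):

```
# Hedged sketch: NFSv4 mount with Kerberos integrity protection.
storage.example.com:/ifs/data    /data    nfs4  sec=krb5i,_netdev  0 0

# For the few mounts that need privacy (encryption), at a performance cost:
storage.example.com:/ifs/secure  /secure  nfs4  sec=krb5p,_netdev  0 0
```

The client also needs rpc.gssd running and a valid host keytab for the sec=krb5* flavours to work at all.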

 

William

 

From: Burian, John via slurm-users <slurm-users@lists.schedmd.com>
Sent: 30 April 2025 20:52
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] Re: Slurm and Kerberos/GSSAPI

 

William,

 

Thanks for the information. I’m in the same boat, NFSv4 filesystem. I don’t follow this:

 

“We do have to get the users to *disable* GSSAPI in the ssh client (we have instructions for PuTTY and MobaXterm) because the login node absolutely needs the username and password in order to get the Kerberos credential.”

 

Is that because the AUKS mechanism doesn’t work right with credentials forwarded through SSH?

 

John

 

From: william@signalbox.org.uk <william.d.l.brown@gmail.com>
Date: Wednesday, April 30, 2025 at 12:26 PM
To: Burian, John <John.Burian@nationwidechildrens.org>, slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: RE: [slurm-users] Slurm and Kerberos/GSSAPI


Yes we do.   We run a cluster which uses NFSv4 exclusively to access the shared file system (on Dell PowerScale), so all users need the Kerberos tickets from Active Directory to even access their login directories.   We use the RFC2307 attributes in AD to provide a consistent UID and GID for all our users.
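For anyone unfamiliar with them, the RFC2307 attributes on an AD user object look roughly like this (LDIF-style; the DN, numbers and paths are invented for illustration):

```
dn: CN=jsmith,OU=Users,DC=example,DC=com
uidNumber: 12345
gidNumber: 5000
unixHomeDirectory: /home/jsmith
loginShell: /bin/bash
```

Because every node resolves users via AD, the same uidNumber/gidNumber pair applies on the login nodes, the compute nodes and the storage array.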

 

It does have some implications….

 

We use AUKS to manage credential renewal and propagation to the compute nodes.  We have documentation based on the original CERN documents and information on the SchedMD Slurm site.

 

We do have to get the users to *disable* GSSAPI in the ssh client (we have instructions for PuTTY and MobaXterm) because the login node absolutely needs the username and password in order to get the Kerberos credential.   The nodes are all AD-joined and have constrained delegation enabled so that they can find the SPNs for the NFS services and pass on the tickets.
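For users on an OpenSSH command-line client, the equivalent of our PuTTY/MobaXterm instructions would be something like the following in ~/.ssh/config (the hostname is a placeholder):

```
# Hedged sketch: force password authentication so the login node
# receives the password and can obtain a Kerberos TGT itself.
Host login.cluster.example.com
    GSSAPIAuthentication no
    GSSAPIDelegateCredentials no
    PubkeyAuthentication no
```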

 

It also means that users cannot use ssh keys, so things like using the cluster as a back-end for VScode are not going to work (or at least not seamlessly).

 

There are quite a few moving parts but it does all work.  It took a while to get to that point.   It helps a lot that I have access myself to <everything>, so I do not have to beg an AD team to make changes or tell me how things are configured.

 


 

From: Burian, John via slurm-users <slurm-users@lists.schedmd.com>
Sent: 30 April 2025 15:39
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] Slurm and Kerberos/GSSAPI

 

Does anyone have any experience with using Kerberos/GSSAPI and Slurm? I’m specifically wondering if there is a known mechanism for providing proper Kerberos credentials to Slurm batch jobs, such that those processes would be able to access a filesystem that requires Kerberos credentials. Some quick searching returned nothing useful. Interactive jobs have a similar problem, but I’m hoping that SSH credential forwarding can be leveraged there.

 

I’m nothing like an expert in Kerberos, so forgive any apparent ignorance.

 

Thanks,
John