[slurm-users] pam_slurm_adopt not working for all users

Lloyd Brown lloyd_brown at byu.edu
Tue May 25 15:15:58 UTC 2021


We had something similar happen when we migrated away from a 
Rocks-based cluster.  We used a script like the one attached, in 
/etc/profile.d, which was modeled heavily on something similar in Rocks.

You might need to adapt it a bit for your situation, but otherwise it's 
pretty straightforward.

Lloyd

-- 
Lloyd Brown
HPC Systems Administrator
Office of Research Computing
Brigham Young University
http://marylou.byu.edu



On 5/25/21 8:56 AM, Loris Bennett wrote:
> Hi Ole,
>
> Thanks for the links.
>
> I have discovered that the users whose /home directories were migrated
> from our previous cluster all seem to have a pair of keys which were
> created along with files like '~/.bash_profile'.  Users who have been
> set up on the new cluster don't have these files.
>
> Is there some /etc/skel-like mechanism which will create passwordless
> SSH keys when a user logs into the system for the first time?  It looks
> increasingly to me that such a mechanism must have existed on our old
> cluster.
>
> Cheers,
>
> Loris
>
>   
> Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk> writes:
>
>> Hi Loris,
>>
>> I think you need, as pointed out by others, either of:
>>
>> * SSH keys, see
>> https://wiki.fysik.dtu.dk/niflheim/SLURM#ssh-keys-for-password-less-access-to-cluster-nodes
>>
>> * SSH host-based authentication, see
>> https://wiki.fysik.dtu.dk/niflheim/SLURM#host-based-authentication
>>
>> /Ole
>>
>> On 5/25/21 2:09 PM, Loris Bennett wrote:
>>> Hi everyone,
>>>
>>> Thanks for all the replies.
>>>
>>> I think my main problem is that I expect logging in to a node with a job
>>> to work with pam_slurm_adopt but without any SSH keys.  My assumption
>>> was that MUNGE takes care of the authentication, since users' jobs start
>>> on nodes without the need for keys.
>>>
>>> Can someone confirm that this expectation is wrong and, if possible, why
>>> the analogy with jobs is incorrect?
>>>
>>> I have a vague memory that this used to work on our old cluster with an
>>> older version of Slurm, but I could be thinking of a time before we set
>>> up pam_slurm_adopt.
>>>
>>> Cheers,
>>>
>>> Loris
>>>     
>>>
>>> Brian Andrus <toomuchit at gmail.com> writes:
>>>
>>>> Oh, you could also use the ssh-agent to manage the keys, then use 'ssh-add
>>>> ~/.ssh/id_rsa' to type the passphrase once for your whole session (from that
>>>> system).
>>>>
>>>> Brian Andrus
>>>>
>>>>
>>>> On 5/21/2021 5:53 AM, Loris Bennett wrote:
>>>>> Hi,
>>>>>
>>>>> We have set up pam_slurm_adopt using the official Slurm documentation
>>>>> and Ole's information on the subject.  It works for a user who has SSH
>>>>> keys set up, albeit the passphrase is needed:
>>>>>
>>>>>      $ salloc --partition=gpu --gres=gpu:1 --qos=hiprio --ntasks=1 --time=00:30:00 --mem=100
>>>>>      salloc: Granted job allocation 7202461
>>>>>      salloc: Waiting for resource configuration
>>>>>      salloc: Nodes g003 are ready for job
>>>>>
>>>>>      $ ssh g003
>>>>>      Warning: Permanently added 'g003' (ECDSA) to the list of known hosts.
>>>>>      Enter passphrase for key '/home/loris/.ssh/id_rsa':
>>>>>      Last login: Wed May  5 08:50:00 2021 from login.curta.zedat.fu-berlin.de
>>>>>
>>>>>      $ ssh g004
>>>>>      Warning: Permanently added 'g004' (ECDSA) to the list of known hosts.
>>>>>      Enter passphrase for key '/home/loris/.ssh/id_rsa':
>>>>>      Access denied: user loris (uid=182317) has no active jobs on this node.
>>>>>      Access denied by pam_slurm_adopt: you have no active jobs on this node
>>>>>      Authentication failed.
>>>>>
>>>>> If SSH keys are not set up, then the user is asked for a password:
>>>>>
>>>>>      $ squeue --me
>>>>>                   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
>>>>>                 7201647      main test_job nokeylee  R    3:45:24      1 c005
>>>>>                 7201646      main test_job nokeylee  R    3:46:09      1 c005
>>>>>      $ ssh c005
>>>>>      Warning: Permanently added 'c005' (ECDSA) to the list of known hosts.
>>>>>      nokeylee@c005's password:
>>>>>
>>>>> My assumption was that a user should be able to log into a node on which
>>>>> that person has a running job without any further ado, i.e. without the
>>>>> necessity to set up anything else or to enter any credentials.
>>>>>
>>>>> Is this assumption correct?
>>>>>
>>>>> If so, how can I best debug what I have done wrong?
>>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ssh-key.sh
Type: application/x-shellscript
Size: 732 bytes
Desc: not available
URL: <http://lists.schedmd.com/pipermail/slurm-users/attachments/20210525/e21dfa82/attachment.bin>
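
Since the attachment itself is not preserved in the archive, here is a
minimal sketch of what an /etc/profile.d script of the kind described
above might look like.  It is not the original ssh-key.sh.  The function
name, the ed25519 key type, and the key file name are assumptions chosen
for illustration, and the approach assumes home directories are shared
across the nodes so that authorized_keys is visible everywhere.

```shell
# Hypothetical sketch of an /etc/profile.d/ssh-key.sh; NOT the scrubbed
# attachment. Names and key type below are illustrative assumptions.

# Create a passphrase-less key pair on first login and authorize it for
# logins between cluster nodes. The body runs in a subshell so that no
# variables leak into the user's login shell.
ensure_cluster_ssh_key() (
    ssh_dir="${1:-$HOME/.ssh}"
    key="$ssh_dir/id_ed25519"

    # Do nothing if the key already exists (e.g. on a later login).
    if [ ! -f "$key" ]; then
        mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" &&
        ssh-keygen -q -t ed25519 -N "" -f "$key" &&
        cat "$key.pub" >> "$ssh_dir/authorized_keys" &&
        chmod 600 "$ssh_dir/authorized_keys"
    fi
)
```

To take effect at login, the file would end with a call such as
`ensure_cluster_ssh_key`, possibly guarded so that it is skipped for
root and system accounts.  A key without a passphrase is only as safe as
the file permissions protecting it, so this suits intra-cluster use
rather than access from outside.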

