[slurm-users] pam_slurm_adopt not working for all users
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Tue May 25 12:38:46 UTC 2021
Hi Loris,
As others have pointed out, I think you need one of the following:
* SSH keys, see
https://wiki.fysik.dtu.dk/niflheim/SLURM#ssh-keys-for-password-less-access-to-cluster-nodes
* SSH host-based authentication, see
https://wiki.fysik.dtu.dk/niflheim/SLURM#host-based-authentication
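
For the host-based option, here is a minimal sketch of the settings involved. It assumes stock OpenSSH file locations, and it uses the login host name that appears in the transcript further down as a stand-in for the real list of trusted hosts. The PAM line is the usual account-stack entry for pam_slurm_adopt. Because the module sits in the account stack, it only decides whether an already authenticated user may enter the node. It does not authenticate anyone itself, which is why keys or host-based authentication are still needed.

```text
# /etc/ssh/sshd_config on each compute node
HostbasedAuthentication yes

# /etc/ssh/ssh_config on the login node (client side)
# EnableSSHKeysign belongs in the global section, before any Host block
EnableSSHKeysign yes
Host *
    HostbasedAuthentication yes

# /etc/ssh/shosts.equiv on each compute node: hosts trusted to vouch for users
# (assumed example; the login node's host key must also be listed in
# /etc/ssh/ssh_known_hosts on the compute node)
login.curta.zedat.fu-berlin.de

# /etc/pam.d/sshd on each compute node (account stack, not auth)
account    required    pam_slurm_adopt.so
```

After changing sshd_config, sshd on the compute nodes has to be reloaded. Running "ssh -v <node>" from the login node should then show whether the hostbased method is offered and accepted.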
/Ole
On 5/25/21 2:09 PM, Loris Bennett wrote:
> Hi everyone,
>
> Thanks for all the replies.
>
> I think my main problem is that I expect logging in to a node with a job
> to work with pam_slurm_adopt but without any SSH keys. My assumption
> was that MUNGE takes care of the authentication, since users' jobs start
> on nodes without the need for keys.
>
> Can someone confirm that this expectation is wrong and, if possible, why
> the analogy with jobs is incorrect?
>
> I have a vague memory that this used to work on our old cluster with an
> older version of Slurm, but I could be thinking of a time before we set
> up pam_slurm_adopt.
>
> Cheers,
>
> Loris
>
>
> Brian Andrus <toomuchit at gmail.com> writes:
>
>> Oh, you could also use the ssh-agent to manage the keys, then use 'ssh-add
>> ~/.ssh/id_rsa' to type the passphrase once for your whole session (from that
>> system).
>>
>> Brian Andrus
>>
>>
>> On 5/21/2021 5:53 AM, Loris Bennett wrote:
>>> Hi,
>>>
>>> We have set up pam_slurm_adopt using the official Slurm documentation
>>> and Ole's information on the subject. It works for a user who has SSH
>>> keys set up, albeit the passphrase is needed:
>>>
>>> $ salloc --partition=gpu --gres=gpu:1 --qos=hiprio --ntasks=1 --time=00:30:00 --mem=100
>>> salloc: Granted job allocation 7202461
>>> salloc: Waiting for resource configuration
>>> salloc: Nodes g003 are ready for job
>>>
>>> $ ssh g003
>>> Warning: Permanently added 'g003' (ECDSA) to the list of known hosts.
>>> Enter passphrase for key '/home/loris/.ssh/id_rsa':
>>> Last login: Wed May 5 08:50:00 2021 from login.curta.zedat.fu-berlin.de
>>>
>>> $ ssh g004
>>> Warning: Permanently added 'g004' (ECDSA) to the list of known hosts.
>>> Enter passphrase for key '/home/loris/.ssh/id_rsa':
>>> Access denied: user loris (uid=182317) has no active jobs on this node.
>>> Access denied by pam_slurm_adopt: you have no active jobs on this node
>>> Authentication failed.
>>>
>>> If SSH keys are not set up, then the user is asked for a password:
>>>
>>> $ squeue --me
>>> JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
>>> 7201647 main test_job nokeylee R 3:45:24 1 c005
>>> 7201646 main test_job nokeylee R 3:46:09 1 c005
>>> $ ssh c005
>>> Warning: Permanently added 'c005' (ECDSA) to the list of known hosts.
>>> nokeylee@c005's password:
>>>
>>> My assumption was that a user should be able to log into a node on which
>>> that person has a running job without any further ado, i.e. without the
>>> necessity to set up anything else or to enter any credentials.
>>>
>>> Is this assumption correct?
>>>
>>> If so, how can I best debug what I have done wrong?
>>>