[slurm-users] pam_slurm_adopt always claims no active jobs even when they do
Paul Raines
raines at nmr.mgh.harvard.edu
Thu Oct 29 11:56:30 UTC 2020
The debugging was useful. The problem turned out to be that I am running
with SELinux enabled due to corporate policy, and SELinux is blocking
sshd from writing to the socket files under /var/slurm/spool/d:
time->Thu Oct 29 07:53:50 2020
type=AVC msg=audit(1603972430.809:2800): avc: denied { write } for
pid=403840 comm="sshd" name="rtx-05_811.4294967295" dev="md122" ino=2228938
scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023
tcontext=system_u:object_r:var_t:s0 tclass=sock_file permissive=1
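
For anyone triaging a similar denial, the record above can be split into
its fields to see which process type was refused which permission on
which object. This is only a minimal sketch, assuming Python 3 and the
record joined onto one line; parse_avc and summarize are hypothetical
helper names, not part of Slurm or the audit tools:

```python
import re

# The AVC record quoted above, joined onto a single line.
AVC_RECORD = (
    'type=AVC msg=audit(1603972430.809:2800): avc: denied { write } for '
    'pid=403840 comm="sshd" name="rtx-05_811.4294967295" dev="md122" '
    'ino=2228938 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 '
    'tcontext=system_u:object_r:var_t:s0 tclass=sock_file permissive=1'
)


def parse_avc(record):
    """Return the key=value fields and denied permissions of one AVC record."""
    fields = {}
    # Values are either double-quoted strings or runs of non-whitespace.
    for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', record):
        fields[key] = value.strip('"')
    # The denied permissions sit between the braces after "denied".
    perms = re.search(r'denied\s+\{([^}]*)\}', record)
    fields["perms"] = perms.group(1).split() if perms else []
    return fields


def summarize(fields):
    """Build a one-line summary; the type is the third part of a context."""
    source_type = fields["scontext"].split(":")[2]
    target_type = fields["tcontext"].split(":")[2]
    perms = ",".join(fields["perms"])
    return (
        f'{fields["comm"]} ({source_type}) denied {perms} on '
        f'{fields["tclass"]} {fields["name"]} ({target_type})'
    )


if __name__ == "__main__":
    print(summarize(parse_avc(AVC_RECORD)))
```

For this record the summary names sshd_t as the source type and var_t as
the target type of the sock_file, which are the two types any local
policy change would need to account for.
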
-- Paul Raines (http://help.nmr.mgh.harvard.edu)
On Mon, 26 Oct 2020 9:26am, Paul Raines wrote:
>
> With debugging on I get:
>
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug: Reading slurm.conf
> file: /etc/slurm/slurm.conf
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 808,
> stepid = 4294967295
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 808,
> stepid = 0
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug: _step_connect:
> connect() failed dir /var/slurm/spool/d node rtx-03 step 808.4294967295
> Permission denied
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug3: unable to connect to
> step 808.4294967295 on rtx-03: Permission denied
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: send_user_msg: Access denied
> by pam_slurm_adopt: you have no active jobs on this node
> Oct 26 09:22:33 rtx-03 sshd[176647]: pam_access(sshd:account): access denied
> for user `raines' from `10.162.254.11'
> Oct 26 09:22:33 rtx-03 sshd[176647]: fatal: Access denied for user raines by
> PAM account configuration [preauth]
>
>
> -- Paul Raines (http://help.nmr.mgh.harvard.edu)
>
>
>
> On Fri, 23 Oct 2020 11:12pm, Wensheng Deng wrote:
>
>> Append 'log_level=debug5' to the pam_slurm_adopt line in system-auth,
>> restart sshd, then try a new job and ssh session. Then check the log
>> messages in /var/log/secure...
>>
>>
>> On Fri, Oct 23, 2020 at 9:04 PM Paul Raines <raines at nmr.mgh.harvard.edu>
>> wrote:
>>
>>>
>>> I am running Slurm 20.02.3 on CentOS 7 systems. I have pam_slurm_adopt
>>> setup in /etc/pam.d/system-auth and slurm.conf has
>>> PrologFlags=Contain,X11
>>> I also have masked systemd-logind
>>>
>>> But pam_slurm_adopt always denies login with "Access denied by
>>> pam_slurm_adopt: you have no active jobs on this node" even when the
>>> user most definitely has a job running on the node via srun
>>>
>>> Any clues as to why pam_slurm_adopt thinks there is no job?
>>>
>>> serena [raines] squeue
>>>   JOBID PARTITION  NAME    USER ST     TIME NODES NODELIST(REASON)
>>>     785    lcnrtx  tcsh  raines  R 19:44:51     1 rtx-03
>>> serena [raines] ssh rtx-03
>>> Access denied by pam_slurm_adopt: you have no active jobs on this node
>>> Authentication failed.
>>>
>>>
>>>
>>
>