[slurm-users] pam_slurm_adopt always claims no active jobs even when they do

Paul Raines raines at nmr.mgh.harvard.edu
Thu Oct 29 11:56:30 UTC 2020


The debugging was useful.  The problem turned out to be that I am running
with SELinux enabled due to corporate policy.  SELinux was blocking sshd
access to the socket files in /var/slurm/spool/d:

time->Thu Oct 29 07:53:50 2020
type=AVC msg=audit(1603972430.809:2800): avc:  denied  { write } for 
pid=403840 comm="sshd" name="rtx-05_811.4294967295" dev="md122" ino=2228938 
scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 
tcontext=system_u:object_r:var_t:s0 tclass=sock_file permissive=1
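
For anyone reading a record like this for the first time, the relevant
fields can be pulled apart with a few lines of Python.  This is only a
sketch: parse_avc is a hypothetical helper written for illustration, not
part of Slurm or the audit tools, and it assumes the record has been joined
back onto a single line.

```python
import re

# The AVC record from above, joined onto one line.
AVC_LINE = (
    'type=AVC msg=audit(1603972430.809:2800): avc:  denied  { write } for '
    'pid=403840 comm="sshd" name="rtx-05_811.4294967295" dev="md122" '
    'ino=2228938 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 '
    'tcontext=system_u:object_r:var_t:s0 tclass=sock_file permissive=1'
)


def parse_avc(line):
    """Return the key=value fields of one AVC record plus the denied perms.

    Hypothetical helper for illustration only.
    """
    # Collect every key=value pair; values may be double-quoted.
    fields = {
        key: value.strip('"')
        for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    }
    # The denied permissions are listed between the braces.
    perms = re.search(r"\{([^}]*)\}", line)
    fields["perms"] = perms.group(1).split() if perms else []
    return fields


record = parse_avc(AVC_LINE)

# SELinux contexts have the form user:role:type:level, so the type is the
# third colon-separated field.
source_type = record["scontext"].split(":")[2]
target_type = record["tcontext"].split(":")[2]

print(record["comm"], record["perms"], source_type, "->", target_type,
      record["tclass"])
```

Read this way, the record says that a process in the sshd_t domain was
denied write on a sock_file labelled var_t, which is the step socket that
pam_slurm_adopt has to connect to.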

-- Paul Raines (http://help.nmr.mgh.harvard.edu)



On Mon, 26 Oct 2020 9:26am, Paul Raines wrote:

>
> With debugging on I get:
>
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug:  Reading slurm.conf 
> file: /etc/slurm/slurm.conf
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 808, 
> stepid = 4294967295
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 808, 
> stepid = 0
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug:  _step_connect: 
> connect() failed dir /var/slurm/spool/d node rtx-03 step 808.4294967295 
> Permission denied
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug3: unable to connect to 
> step 808.4294967295 on rtx-03: Permission denied
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: send_user_msg: Access denied 
> by pam_slurm_adopt: you have no active jobs on this node
> Oct 26 09:22:33 rtx-03 sshd[176647]: pam_access(sshd:account): access denied 
> for user `raines' from `10.162.254.11'
> Oct 26 09:22:33 rtx-03 sshd[176647]: fatal: Access denied for user raines by 
> PAM account configuration [preauth]
>
>
> -- Paul Raines (http://help.nmr.mgh.harvard.edu)
>
>
>
> On Fri, 23 Oct 2020 11:12pm, Wensheng Deng wrote:
>
>>  Append ‘log_level=debug5’ to the pam_slurm_adopt line in system-auth,
>>  restart sshd, and try a new job and ssh session.  Then check the log
>>  messages in /var/log/secure...
>> 
>>
>>  On Fri, Oct 23, 2020 at 9:04 PM Paul Raines <raines at nmr.mgh.harvard.edu>
>>  wrote:
>> 
>>>
>>>  I am running Slurm 20.02.3 on CentOS 7 systems.  I have pam_slurm_adopt
>>>  set up in /etc/pam.d/system-auth and slurm.conf has
>>>  PrologFlags=Contain,X11
>>>  I also have masked systemd-logind
>>>
>>>  But pam_slurm_adopt always denies login with "Access denied by
>>>  pam_slurm_adopt: you have no active jobs on this node" even when the
>>>  user most definitely has a job running on the node via srun
>>>
>>>  Any clues as to why pam_slurm_adopt thinks there is no job?
>>>
>>>  serena [raines] squeue
>>>                JOBID PARTITION     NAME     USER ST       TIME  NODES
>>>  NODELIST(REASON)
>>>                  785    lcnrtx     tcsh   raines  R   19:44:51      1
>>>  rtx-03
>>>  serena [raines] ssh rtx-03
>>>  Access denied by pam_slurm_adopt: you have no active jobs on this node
>>>  Authentication failed.
>>> 
>>> 
>>> 
>> 
>

