[slurm-users] pam_slurm_adopt always claims no active jobs even when they do

Wensheng Deng wd35 at nyu.edu
Thu Oct 29 12:10:20 UTC 2020


Interesting...


On Thu, Oct 29, 2020 at 7:56 AM Paul Raines <raines at nmr.mgh.harvard.edu>
wrote:

> The debugging was useful.  The problem turned out to be that I am running
> with SELinux enabled due to corporate policy, and SELinux was blocking
> sshd's access to the socket files in /var/slurm/spool/d:
>
> time->Thu Oct 29 07:53:50 2020
> type=AVC msg=audit(1603972430.809:2800): avc:  denied  { write } for pid=403840 comm="sshd" name="rtx-05_811.4294967295" dev="md122" ino=2228938 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 tcontext=system_u:object_r:var_t:s0 tclass=sock_file permissive=1
>
> -- Paul Raines (http://help.nmr.mgh.harvard.edu)
>
>
>
> On Mon, 26 Oct 2020 9:26am, Paul Raines wrote:
>
> >
> > With debugging on I get:
> >
> > Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug:  Reading slurm.conf file: /etc/slurm/slurm.conf
> > Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 808, stepid = 4294967295
> > Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 808, stepid = 0
> > Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug:  _step_connect: connect() failed dir /var/slurm/spool/d node rtx-03 step 808.4294967295 Permission denied
> > Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug3: unable to connect to step 808.4294967295 on rtx-03: Permission denied
> > Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: send_user_msg: Access denied by pam_slurm_adopt: you have no active jobs on this node
> > Oct 26 09:22:33 rtx-03 sshd[176647]: pam_access(sshd:account): access denied for user `raines' from `10.162.254.11'
> > Oct 26 09:22:33 rtx-03 sshd[176647]: fatal: Access denied for user raines by PAM account configuration [preauth]
> >
> >
> > -- Paul Raines (http://help.nmr.mgh.harvard.edu)
> >
> >
> >
> > On Fri, 23 Oct 2020 11:12pm, Wensheng Deng wrote:
> >
> >>  Append 'log_level=debug5' to the pam_slurm_adopt line in system-auth,
> >>  restart sshd, then start a new job and a new ssh session.  After that,
> >>  check the log messages in /var/log/secure...
> >>
> >>
> >>  On Fri, Oct 23, 2020 at 9:04 PM Paul Raines <raines at nmr.mgh.harvard.edu> wrote:
> >>
> >>>
> >>>  I am running Slurm 20.02.3 on CentOS 7 systems.  I have pam_slurm_adopt
> >>>  set up in /etc/pam.d/system-auth, and slurm.conf has
> >>>  PrologFlags=Contain,X11
> >>>  I have also masked systemd-logind.
> >>>
> >>>  But pam_slurm_adopt always denies login with "Access denied by
> >>>  pam_slurm_adopt: you have no active jobs on this node" even when the
> >>>  user most definitely has a job running on the node via srun
> >>>
> >>>  Any clues as to why pam_slurm_adopt thinks there is no job?
> >>>
> >>>  serena [raines] squeue
> >>>                JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
> >>>                  785    lcnrtx     tcsh   raines  R   19:44:51      1 rtx-03
> >>>  serena [raines] ssh rtx-03
> >>>  Access denied by pam_slurm_adopt: you have no active jobs on this node
> >>>  Authentication failed.
> >>>
> >>>
> >>>
> >>
> >
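
For anyone who hits the same denial: below is a minimal Python sketch (not part of Slurm or the audit tools) that pulls the relevant fields out of the AVC record quoted above. The helper names parse_avc, selinux_type and split_step_socket are hypothetical, made up for illustration, and the record is the one from this thread joined onto a single line. It only reads the record; it does not change any SELinux policy or labels.

```python
import re

# AVC record copied from the audit log quoted in this thread,
# joined onto one line (the mail client had wrapped it).
AVC_LINE = (
    'type=AVC msg=audit(1603972430.809:2800): avc:  denied  { write } for '
    'pid=403840 comm="sshd" name="rtx-05_811.4294967295" dev="md122" '
    'ino=2228938 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 '
    'tcontext=system_u:object_r:var_t:s0 tclass=sock_file permissive=1'
)


def parse_avc(line):
    """Return a dict of the fields in one AVC denial record (hypothetical helper)."""
    m = re.search(r'avc:\s+denied\s+\{\s*([^}]*?)\s*\}', line)
    if m is None:
        raise ValueError("not an AVC denial record")
    fields = {"permissions": m.group(1).split()}
    # key=value pairs after the permission list; values are either
    # double-quoted or run up to the next whitespace.
    for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', line[m.end():]):
        fields[key] = value.strip('"')
    return fields


def selinux_type(context):
    """Extract the type from a user:role:type:level context string."""
    return context.split(":")[2]


def split_step_socket(name):
    """Split '<node>_<jobid>.<stepid>', the shape of the socket name in the record."""
    node, _, step = name.rpartition("_")
    jobid, _, stepid = step.partition(".")
    return node, int(jobid), int(stepid)


if __name__ == "__main__":
    rec = parse_avc(AVC_LINE)
    node, jobid, stepid = split_step_socket(rec["name"])
    print(rec["comm"], "in domain", selinux_type(rec["scontext"]),
          "denied", rec["permissions"], "on", rec["tclass"],
          "labeled", selinux_type(rec["tcontext"]))
    print("socket belongs to node", node, "job", jobid, "step", stepid)
```

For the record in this thread, the sketch shows sshd running in sshd_t being denied write on a sock_file that carries the generic var_t label, with the socket named for job 811, step 4294967295 on rtx-05. 4294967295 is 2^32 - 1; as far as I can tell, Slurm 20.02 uses that step id for the extern step that PrologFlags=Contain creates, which matches the step pam_slurm_adopt fails to connect to in the debug log.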