[slurm-users] pam_slurm_adopt always claims no active jobs even when they do
William Brown
william at signalbox.org.uk
Thu Oct 29 14:25:58 UTC 2020
That is interesting, as I run with SELinux enforcing.
I will do some more testing of attaching by ssh to nodes with running jobs.
William
On Thu, 29 Oct 2020, 11:58 Paul Raines, <raines at nmr.mgh.harvard.edu> wrote:
> The debugging was useful. The problem turned out to be that I am running
> with SELinux enabled due to corporate policy. The issue was that SELinux is
> blocking sshd's access to the socket files in /var/slurm/spool/d:
>
> time->Thu Oct 29 07:53:50 2020
> type=AVC msg=audit(1603972430.809:2800): avc: denied { write } for pid=403840 comm="sshd" name="rtx-05_811.4294967295" dev="md122" ino=2228938 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 tcontext=system_u:object_r:var_t:s0 tclass=sock_file permissive=1
>
> -- Paul Raines (http://help.nmr.mgh.harvard.edu)
>
>
>
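The AVC record quoted above can be broken into its fields to see which SELinux types are involved. The following is only a minimal sketch, assuming the record has been joined onto a single line; the helper name `parse_avc` is hypothetical and is not part of any audit tooling.

```python
import re

def parse_avc(record: str) -> dict:
    """Split an AVC audit record into key=value fields plus the denied permissions."""
    # Values are either double-quoted strings or runs of non-whitespace.
    fields = {k: v.strip('"') for k, v in re.findall(r'(\w+)=("[^"]*"|\S+)', record)}
    # The permission list sits between braces after "denied".
    perms = re.search(r'denied\s+\{([^}]*)\}', record)
    fields["permissions"] = perms.group(1).split() if perms else []
    return fields

# The record from the message above, joined onto one line.
record = (
    'type=AVC msg=audit(1603972430.809:2800): avc: denied { write } for '
    'pid=403840 comm="sshd" name="rtx-05_811.4294967295" dev="md122" '
    'ino=2228938 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 '
    'tcontext=system_u:object_r:var_t:s0 tclass=sock_file permissive=1'
)

avc = parse_avc(record)
# The third colon-separated element of an SELinux context is the type.
source_type = avc["scontext"].split(":")[2]
target_type = avc["tcontext"].split(":")[2]
print(avc["comm"], avc["permissions"], source_type, "->", target_type, avc["tclass"])
```

Read this way, the record says that a process of type sshd_t was denied write on a sock_file labelled var_t, which matches the description of sshd being blocked from the step sockets.
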
> On Mon, 26 Oct 2020 9:26am, Paul Raines wrote:
>
> >
> > With debugging on I get:
> >
> > Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug: Reading slurm.conf file: /etc/slurm/slurm.conf
> > Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 808, stepid = 4294967295
> > Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 808, stepid = 0
> > Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug: _step_connect: connect() failed dir /var/slurm/spool/d node rtx-03 step 808.4294967295 Permission denied
> > Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug3: unable to connect to step 808.4294967295 on rtx-03: Permission denied
> > Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: send_user_msg: Access denied by pam_slurm_adopt: you have no active jobs on this node
> > Oct 26 09:22:33 rtx-03 sshd[176647]: pam_access(sshd:account): access denied for user `raines' from `10.162.254.11'
> > Oct 26 09:22:33 rtx-03 sshd[176647]: fatal: Access denied for user raines by PAM account configuration [preauth]
> >
> >
> > -- Paul Raines (http://help.nmr.mgh.harvard.edu)
> >
> >
> >
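In the debug output quoted above, the module does find job 808 and its steps; what fails is the connection to the step, and the user still sees "no active jobs". A small sketch for picking such failures out of a log follows. The function name `failed_step_connects` is hypothetical, and the pattern is assumed from the lines in this thread only, so other Slurm versions may word the message differently.

```python
import re

# Pattern taken from the debug3 line quoted in this thread (an assumption,
# not a documented log format).
STEP_FAIL = re.compile(r"unable to connect to step (\d+)\.(\d+) on (\S+): (.+)$")

def failed_step_connects(lines):
    """Return (jobid, stepid, node, reason) for each failed step connection."""
    failures = []
    for line in lines:
        match = STEP_FAIL.search(line)
        if match:
            job, step, node, reason = match.groups()
            failures.append((int(job), int(step), node, reason.strip()))
    return failures

# Three of the log lines from the message above, each joined onto one line.
log = [
    "Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 808, stepid = 4294967295",
    "Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug3: unable to connect to step 808.4294967295 on rtx-03: Permission denied",
    "Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: send_user_msg: Access denied by pam_slurm_adopt: you have no active jobs on this node",
]

failures = failed_step_connects(log)
for job, step, node, reason in failures:
    print(f"job {job} step {step} on {node}: {reason}")
```

A "Permission denied" reason here, as opposed to no job being found at all, is the hint that something on the node is blocking access to the step socket.
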
> > On Fri, 23 Oct 2020 11:12pm, Wensheng Deng wrote:
> >
> >> Append ‘log_level=debug5’ to the pam_slurm_adopt line in system-auth,
> >> restart sshd, then try a new job and ssh session. Then check the log
> >> messages in /var/log/secure...
> >>
> >>
> >> On Fri, Oct 23, 2020 at 9:04 PM Paul Raines <raines at nmr.mgh.harvard.edu> wrote:
> >>
> >>>
> >>> I am running Slurm 20.02.3 on CentOS 7 systems. I have pam_slurm_adopt
> >>> set up in /etc/pam.d/system-auth and slurm.conf has
> >>> PrologFlags=Contain,X11
> >>> I have also masked systemd-logind.
> >>> But pam_slurm_adopt always denies login with "Access denied by
> >>> pam_slurm_adopt: you have no active jobs on this node" even when the
> >>> user most definitely has a job running on the node via srun
> >>>
> >>> Any clues as to why pam_slurm_adopt thinks there is no job?
> >>>
> >>> serena [raines] squeue
> >>>   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
> >>>     785    lcnrtx     tcsh   raines  R   19:44:51      1 rtx-03
> >>> serena [raines] ssh rtx-03
> >>> Access denied by pam_slurm_adopt: you have no active jobs on this node
> >>> Authentication failed.
> >>>
> >>>
> >>>
> >>
> >