<div dir="auto">That is interesting, as I also run with SELinux enforcing. <div dir="auto"><br></div><div dir="auto">I will do some more testing of attaching by ssh to nodes with running jobs.</div><div dir="auto"><br></div><div dir="auto">William </div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 29 Oct 2020, 11:58 Paul Raines, <<a href="mailto:raines@nmr.mgh.harvard.edu">raines@nmr.mgh.harvard.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The debugging was useful. The problem turned out to be that I am running<br>
with SELinux enabled due to corporate policy. SELinux was blocking sshd's<br>
access to the socket files in /var/slurm/spool/d:<br>
<br>
time->Thu Oct 29 07:53:50 2020<br>
type=AVC msg=audit(1603972430.809:2800): avc: denied { write } for <br>
pid=403840 comm="sshd" name="rtx-05_811.4294967295" dev="md122" ino=2228938 <br>
scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 <br>
tcontext=system_u:object_r:var_t:s0 tclass=sock_file permissive=1<br>
<br>
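For anyone reading a similar denial, here is a minimal Python sketch that pulls the key fields out of the AVC record above. The helper names parse_avc and selinux_type are made up for illustration and are not part of Slurm or the audit tools; the record string is the one quoted above, joined onto one line.<br>
<br>
```python
import re

# AVC record copied from the message above, joined onto one line.
AVC_RECORD = (
    'type=AVC msg=audit(1603972430.809:2800): avc: denied { write } for '
    'pid=403840 comm="sshd" name="rtx-05_811.4294967295" dev="md122" ino=2228938 '
    'scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 '
    'tcontext=system_u:object_r:var_t:s0 tclass=sock_file permissive=1'
)


def parse_avc(record):
    """Return the denied permissions and the key=value fields of one AVC record."""
    fields = {}
    # The denied permissions sit between the braces, e.g. "{ write }".
    perms = re.search(r"denied\s+\{([^}]*)\}", record)
    fields["permissions"] = perms.group(1).split() if perms else []
    # Everything else is a space-separated key=value pair; values may be quoted.
    for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', record):
        fields[key] = value.strip('"')
    return fields


def selinux_type(context):
    """Return the type field (third component) of a user:role:type:level context."""
    return context.split(":")[2]


if __name__ == "__main__":
    info = parse_avc(AVC_RECORD)
    print("process:", info["comm"], "denied:", info["permissions"])
    print("source type:", selinux_type(info["scontext"]))
    print("target type:", selinux_type(info["tcontext"]), "class:", info["tclass"])
```
<br>
Read this way, the record says that a process in the sshd_t domain was denied write on a sock_file labeled var_t, which matches the "Permission denied" from connect() in the pam_slurm_adopt debug log quoted below.<br>
<br>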
-- Paul Raines (<a href="http://help.nmr.mgh.harvard.edu" rel="noreferrer noreferrer" target="_blank">http://help.nmr.mgh.harvard.edu</a>)<br>
<br>
<br>
<br>
On Mon, 26 Oct 2020 9:26am, Paul Raines wrote:<br>
<br>
><br>
> With debugging on I get:<br>
><br>
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug: Reading slurm.conf <br>
> file: /etc/slurm/slurm.conf<br>
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 808, <br>
> stepid = 4294967295<br>
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 808, <br>
> stepid = 0<br>
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug: _step_connect: <br>
> connect() failed dir /var/slurm/spool/d node rtx-03 step 808.4294967295 <br>
> Permission denied<br>
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug3: unable to connect to <br>
> step 808.4294967295 on rtx-03: Permission denied<br>
> Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: send_user_msg: Access denied <br>
> by pam_slurm_adopt: you have no active jobs on this node<br>
> Oct 26 09:22:33 rtx-03 sshd[176647]: pam_access(sshd:account): access denied <br>
> for user `raines' from `10.162.254.11'<br>
> Oct 26 09:22:33 rtx-03 sshd[176647]: fatal: Access denied for user raines by <br>
> PAM account configuration [preauth]<br>
><br>
><br>
> -- Paul Raines (<a href="http://help.nmr.mgh.harvard.edu" rel="noreferrer noreferrer" target="_blank">http://help.nmr.mgh.harvard.edu</a>)<br>
><br>
><br>
><br>
> On Fri, 23 Oct 2020 11:12pm, Wensheng Deng wrote:<br>
><br>
>> Append ‘log_level=debug5’ to the pam_slurm_adopt line in system-auth,<br>
>> restart sshd, then try a new job and ssh session. Then check the log<br>
>> messages in /var/log/secure...<br>
>> <br>
>><br>
>> On Fri, Oct 23, 2020 at 9:04 PM Paul Raines <<a href="mailto:raines@nmr.mgh.harvard.edu" target="_blank" rel="noreferrer">raines@nmr.mgh.harvard.edu</a>><br>
>> wrote:<br>
>> <br>
>>><br>
>>> I am running Slurm 20.02.3 on CentOS 7 systems. I have pam_slurm_adopt<br>
>>> set up in /etc/pam.d/system-auth, and slurm.conf has<br>
>>> PrologFlags=Contain,X11<br>
>>> I have also masked systemd-logind.<br>
>>><br>
>>> But pam_slurm_adopt always denies login with "Access denied by<br>
>>> pam_slurm_adopt: you have no active jobs on this node", even when the<br>
>>> user most definitely has a job running on the node via srun.<br>
>>><br>
>>> Any clues as to why pam_slurm_adopt thinks there is no job?<br>
>>><br>
>>> serena [raines] squeue<br>
>>> JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)<br>
>>> 785 lcnrtx tcsh raines R 19:44:51 1 rtx-03<br>
>>> serena [raines] ssh rtx-03<br>
>>> Access denied by pam_slurm_adopt: you have no active jobs on this node<br>
>>> Authentication failed.<br>
>>> <br>
>>> <br>
>>> <br>
>> <br>
></blockquote></div>