[slurm-users] pam_slurm_adopt working on only some nodes

Wayne Hendricks waynehendricks at gmail.com
Fri Jan 28 17:27:36 UTC 2022


Any idea why pam_slurm_adopt would work on some nodes but not others? Here is an excerpt from one of the nodes:

Jan 28 15:38:54 dgx1-1 sshd[1027640]: pam_sss(sshd:auth): authentication success; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.10.10.1 user=test.user
Jan 28 15:38:54 dgx1-1 pam_slurm_adopt[1027640]: debug2: _establish_config_source: using config_file=/admin/slurm/slurm-21.08.5/etc/slurm.conf (default)
Jan 28 15:38:54 dgx1-1 pam_slurm_adopt[1027640]: debug:  slurm_conf_init: using config_file=/admin/slurm/slurm-21.08.5/etc/slurm.conf
Jan 28 15:38:54 dgx1-1 pam_slurm_adopt[1027640]: debug:  Reading slurm.conf file: /admin/slurm/slurm-21.08.5/etc/slurm.conf
Jan 28 15:38:54 dgx1-1 pam_slurm_adopt[1027640]: debug:  Reading cgroup.conf file /admin/slurm/slurm-21.08.5/etc/cgroup.conf
Jan 28 15:38:54 dgx1-1 pam_slurm_adopt[1027640]: debug4: found StepId=182409.0
Jan 28 15:38:54 dgx1-1 pam_slurm_adopt[1027640]: send_user_msg: Access denied by pam_slurm_adopt: you have no active jobs on this node
Jan 28 15:38:54 dgx1-1 sshd[1027640]: pam_access(sshd:account): access denied for user `test.user' from `10.10.10.1'

squeue output:
182409      v100     bash test.user  R    1:43:58      1 dgx1-1

Other nodes using the exact same config work fine. The debug output doesn’t show much information. Could this be related to cgroups/adoption? Where could I get more information? The only difference I can think of is that the working nodes seem to have been built more recently than the others, but all nodes are patched to the same level and get the same config.
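
To compare working and failing nodes, a small log filter may help. The following is a minimal Python sketch, not part of Slurm: the function names `summarize` and `suspicious` and both regular expressions are assumptions based only on the line format in the excerpt above. It groups pam_slurm_adopt messages by host and PID and flags sessions where a step was found but access was still denied, which is the pattern seen on dgx1-1.

```python
import re
from collections import defaultdict

# Assumed syslog shape, taken from the excerpt: "<date> <host> <prog>[<pid>]: <message>"
LINE_RE = re.compile(r"^(\w{3} +\d+ [\d:]{8}) (\S+) (\w+)\[(\d+)\]: (.*)$")
# Matches the "found StepId=<job>.<step>" debug message from the excerpt.
STEP_RE = re.compile(r"found StepId=(\d+)\.(\S+)")

# Two lines copied from the excerpt, used as sample input.
SAMPLE = [
    "Jan 28 15:38:54 dgx1-1 pam_slurm_adopt[1027640]: debug4: found StepId=182409.0",
    "Jan 28 15:38:54 dgx1-1 pam_slurm_adopt[1027640]: send_user_msg: Access denied "
    "by pam_slurm_adopt: you have no active jobs on this node",
]


def summarize(lines):
    """Group pam_slurm_adopt messages by (host, pid); record steps found and denials."""
    sessions = defaultdict(lambda: {"steps": [], "denied": False})
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group(3) != "pam_slurm_adopt":
            continue  # skip sshd and any other program
        host, pid, msg = m.group(2), m.group(4), m.group(5)
        entry = sessions[(host, pid)]
        step = STEP_RE.search(msg)
        if step:
            entry["steps"].append(step.group(1) + "." + step.group(2))
        if "Access denied by pam_slurm_adopt" in msg:
            entry["denied"] = True
    return dict(sessions)


def suspicious(sessions):
    """Return (host, pid) keys where a step was found but access was still denied."""
    return [key for key, s in sessions.items() if s["denied"] and s["steps"]]


if __name__ == "__main__":
    for host, pid in suspicious(summarize(SAMPLE)):
        print(f"{host} pid {pid}: step found but access denied")
```

Feeding it the sshd/PAM log from a working node and from a failing node would show whether the failing nodes always find the step and then refuse adoption, or whether the behavior varies between login attempts.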