[slurm-users] Issues with pam_slurm_adopt
Juergen Salk
juergen.salk at uni-ulm.de
Fri Apr 8 21:55:59 UTC 2022
Hi Nicolas,
it looks like you have pam_access.so placed in your PAM stack *before*
pam_slurm_adopt.so, and this may get in your way. In fact, the logs
indicate that it is pam_access, not pam_slurm_adopt, that denies access
in the first place:
Apr 8 19:11:32 magi46 sshd[20542]: pam_access(sshd:account): access denied for user `nicolas.greneche' from `172.16.0.3'
Maybe the following web page is useful for setting up
your PAM stack with pam_slurm_adopt:
https://slurm.schedmd.com/pam_slurm_adopt.html
--- snip ---
If you always want to allow access for an administrative group (e.g.,
wheel), stack the pam_access module after pam_slurm_adopt. A success
with pam_slurm_adopt is sufficient to allow access, but the pam_access
module can allow others, such as administrative staff, access even
without jobs on that node:
account sufficient pam_slurm_adopt.so
account required pam_access.so
--- snip ---
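Applied to the /etc/pam.d/sshd you posted, that would roughly mean
swapping the two lines and making pam_slurm_adopt sufficient, e.g.
(untested sketch, keep whatever other options you need):

account required pam_nologin.so
account sufficient pam_slurm_adopt.so log_level=debug5
account required pam_access.so

Note that with "sufficient" a success from pam_slurm_adopt
short-circuits the remaining modules in the account stack.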
We did it that way and it works fine for us. There is just one
drawback: administrative users who are allowed to access compute
nodes without having jobs on them always get an annoying message
from pam_slurm_adopt when doing so, even though the login succeeds:
Access denied by pam_slurm_adopt: you have no active jobs on this node
We've gotten used to it, but now that I see it on the web page, maybe
I'll take a look at the alternative approach with pam_listfile.so.
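From what I remember of that page, the pam_listfile.so variant
whitelists selected users from a file before pam_slurm_adopt is
consulted, roughly like this (untested sketch; the file path is just
an example, check the page for the exact options):

account sufficient pam_listfile.so item=user sense=allow onerr=fail file=/etc/ssh/allowed_users
account required pam_slurm_adopt.so

That should spare the whitelisted users the "no active jobs" message,
since pam_slurm_adopt is never reached for them.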
Best regards
Jürgen
* Nicolas Greneche <nicolas.greneche at univ-paris13.fr> [220408 19:53]:
> Hi,
>
> I have an issue with pam_slurm_adopt since moving from 21.08.5 to 21.08.6:
> it no longer works.
>
> When I log in directly to the node with the root account:
>
> Apr 8 19:06:49 magi46 pam_slurm_adopt[20400]: Ignoring root user
> Apr 8 19:06:49 magi46 sshd[20400]: Accepted publickey for root from
> 172.16.0.3 port 50884 ssh2: ...
> Apr 8 19:06:49 magi46 sshd[20400]: pam_unix(sshd:session): session opened
> for user root(uid=0) by (uid=0)
>
> Everything is OK.
>
> I submit a very simple job, an infinite loop to keep the first compute node
> busy:
>
> nicolas.greneche at magi3:~/test-bullseye/infinite$ cat infinite.slurm
> #!/bin/bash
> #SBATCH --job-name=infinite
> #SBATCH --output=%x.%j.out
> #SBATCH --error=%x.%j.err
> #SBATCH --nodes=1
> srun infinite.sh
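>
> (infinite.sh itself is just a loop along these lines, to keep the
> allocation alive:)
>
> #!/bin/bash
> # sleep forever so the job never finishes on its own
> while true; do
>     sleep 60
> done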
>
> nicolas.greneche at magi3:~/test-bullseye/infinite$ sbatch infinite.slurm
> Submitted batch job 203
>
> nicolas.greneche at magi3:~/test-bullseye/infinite$ squeue
> JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
> 203 COMPUTE infinite nicolas. R 0:03 1 magi46
>
> I have a job running on the node. When I try to log in to the node with the
> same regular account:
>
> nicolas.greneche at magi3:~/test-bullseye/infinite$ ssh magi46
> Access denied by pam_slurm_adopt: you have no active jobs on this node
> Connection closed by 172.16.0.46 port 22
>
> In the auth.log, we can see that the job (JOBID 203) is found, but PAM
> decides that I have no running job on the node:
>
> Apr 8 19:11:32 magi46 sshd[20542]: pam_access(sshd:account): access denied
> for user `nicolas.greneche' from `172.16.0.3'
> Apr 8 19:11:32 magi46 pam_slurm_adopt[20542]: debug2:
> _establish_config_source: using config_file=/run/slurm/conf/slurm.conf
> (cached)
> Apr 8 19:11:32 magi46 pam_slurm_adopt[20542]: debug: slurm_conf_init:
> using config_file=/run/slurm/conf/slurm.conf
> Apr 8 19:11:32 magi46 pam_slurm_adopt[20542]: debug: Reading slurm.conf
> file: /run/slurm/conf/slurm.conf
> Apr 8 19:11:32 magi46 pam_slurm_adopt[20542]: debug: Reading cgroup.conf
> file /run/slurm/conf/cgroup.conf
> Apr 8 19:11:32 magi46 pam_slurm_adopt[20542]: debug4: found
> StepId=203.batch
> Apr 8 19:11:32 magi46 pam_slurm_adopt[20542]: debug4: found StepId=203.0
> Apr 8 19:11:32 magi46 pam_slurm_adopt[20542]: send_user_msg: Access denied
> by pam_slurm_adopt: you have no active jobs on this node
> Apr 8 19:11:32 magi46 sshd[20542]: fatal: Access denied for user
> nicolas.greneche by PAM account configuration [preauth]
>
> I may have missed something; if you have any tips, I'd be delighted.
>
> As appendices, here are the sshd PAM configuration on the compute nodes and
> the slurm.conf:
>
> root at magi46:~# cat /etc/pam.d/sshd
> @include common-auth
> account required pam_nologin.so
> account required pam_access.so
> account required pam_slurm_adopt.so log_level=debug5
>
> @include common-account
> session [success=ok ignore=ignore module_unknown=ignore default=bad] pam_selinux.so close
> session required pam_loginuid.so
> session optional pam_keyinit.so force revoke
>
> @include common-session
> session optional pam_motd.so motd=/run/motd.dynamic
> session optional pam_motd.so noupdate
> session optional pam_mail.so standard noenv
> session required pam_limits.so
> session required pam_env.so
> session required pam_env.so user_readenv=1 envfile=/etc/default/locale
> session [success=ok ignore=ignore module_unknown=ignore default=bad] pam_selinux.so open
>
> @include common-password
>
> root at slurmctld:~# cat /etc/slurm/slurm.conf
> ClusterName=magi
> ControlMachine=slurmctld
> SlurmUser=slurm
> AuthType=auth/munge
>
> MailProg=/usr/bin/mail
> SlurmdDebug=debug
>
> StateSaveLocation=/var/slurm
> SlurmdSpoolDir=/var/slurm
> SlurmctldPidFile=/var/slurm/slurmctld.pid
> SlurmdPidFile=/var/slurm/slurmd.pid
> SlurmdLogFile=/var/log/slurm/slurmd.log
> SlurmctldLogFile=/var/log/slurm/slurmctld.log
> SlurmctldParameters=enable_configless
>
> AccountingStorageHost=slurmctld
> JobAcctGatherType=jobacct_gather/linux
> AccountingStorageType=accounting_storage/slurmdbd
> AccountingStorageEnforce=associations
> JobRequeue=0
> SlurmdTimeout=600
>
> SelectType=select/cons_tres
> SelectTypeParameters=CR_CPU
>
> TmpFS=/scratch
>
> GresTypes=gpu
> PriorityType="priority/multifactor"
>
> Nodename=magi3 Boards=1 Sockets=2 CoresPerSocket=10 ThreadsPerCore=2 State=UNKNOWN
> Nodename=magi[107] Boards=1 Sockets=2 CoresPerSocket=14 ThreadsPerCore=2 RealMemory=92000 State=UNKNOWN
> Nodename=magi[46-53] Boards=1 Sockets=2 CoresPerSocket=10 ThreadsPerCore=2 RealMemory=64000 State=UNKNOWN
>
> PartitionName=MISC-56c Nodes=magi107 Priority=3000 MaxTime=INFINITE State=UP
> PartitionName=COMPUTE Nodes=magi[46-53] Priority=3000 MaxTime=INFINITE State=UP Default=YES
>
> Thank you,
>
> --
> Nicolas Greneche
> USPN
> Research support / CISO (RSSI)
> https://www-magi.univ-paris13.fr
>