[slurm-users] pam_slurm_adopt and memory constraints?

Juergen Salk juergen.salk at uni-ulm.de
Mon Jul 15 20:07:04 UTC 2019


* Andy Georges <andy.georges at ugent.be> [190715 16:17]:
> 
> On Fri, Jul 12, 2019 at 03:21:31PM +0200, Juergen Salk wrote:
> > Dear all,
> >
> > I have configured pam_slurm_adopt in our Slurm test environment by
> > following the corresponding documentation:
> >
> >  https://slurm.schedmd.com/pam_slurm_adopt.html
> >
> > I've set 'PrologFlags=contain' in slurm.conf and also have task/cgroup
> > enabled along with task/affinity (i.e. 'TaskPlugin=task/affinity,task/cgroup').
> 
> <snip>
> 
> > Thus, the ssh session seems to be totally unconstrained by cgroups in
> > terms of memory usage. 
> > <snip> 
> 
> I think we opened an issue for this at https://bugs.schedmd.com/show_bug.cgi?id=5920
> with a proposed fix.

Hi Andy,

Thank you very much. I think this points in the right direction. At
least I can confirm that pam_systemd does interfere with pam_slurm_adopt.

pam_systemd ships with the systemd-libs rpm package on {RHEL,CentOS}7:

$ rpm -qf /usr/lib64/security/pam_systemd.so
systemd-libs-219-57.el7_5.3.x86_64

And pam_systemd is enabled by default in /etc/pam.d/system-auth which
comes with the pam rpm package:

$ cat /etc/pam.d/system-auth
#%PAM-1.0
# This file is auto-generated.
# User changes will be destroyed the next time authconfig is run.
auth        required      pam_env.so
auth        sufficient    pam_unix.so try_first_pass nullok
auth        required      pam_deny.so

account     required      pam_unix.so

password    requisite     pam_pwquality.so try_first_pass local_users_only retry=3 authtok_type=
password    sufficient    pam_unix.so try_first_pass use_authtok nullok sha512 shadow
password    required      pam_deny.so

session     optional      pam_keyinit.so revoke
session     required      pam_limits.so
-session     optional      pam_systemd.so
session     [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid
session     required      pam_unix.so
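
A plain grep for pam_systemd also matches a commented-out line, so it
cannot tell the enabled and disabled states apart. A minimal sketch of a
check that ignores comment lines could look like this. The function name
pam_module_active is hypothetical (not part of PAM or Slurm), and the
default file is simply the one shown above:

```shell
# pam_module_active MODULE [FILE]
# Hypothetical helper: succeed if FILE contains a non-comment line that
# mentions MODULE. FILE defaults to /etc/pam.d/system-auth.
pam_module_active() {
    # Drop lines whose first non-blank character is '#', then look for
    # the module name as a fixed string (no regex interpretation).
    grep -v '^[[:space:]]*#' "${2:-/etc/pam.d/system-auth}" | grep -qF -- "$1"
}

# Usage (run on the compute node):
#   pam_module_active pam_systemd.so && echo "pam_systemd is enabled"
```

The second argument can point at any other PAM configuration file that
needs the same check.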

This is what I get after ssh-ing into the node with the pam configuration 
from above:

$ grep pam_systemd /etc/pam.d/system-auth
-session     optional      pam_systemd.so
$ cat /proc/self/cgroup 
11:cpuset:/slurm/uid_900002/job_375/step_extern
10:hugetlb:/
9:perf_event:/
8:devices:/user.slice
7:net_prio,net_cls:/
6:cpuacct,cpu:/user.slice
5:pids:/user.slice
4:blkio:/user.slice
3:memory:/user.slice
2:freezer:/slurm/uid_900002/job_375/step_extern
1:name=systemd:/user.slice/user-900002.slice/session-38512.scope

And this is what I get after ssh-ing into the very same node 
with pam_systemd disabled (commented out) in /etc/pam.d/system-auth:

$ grep pam_systemd /etc/pam.d/system-auth
# -session     optional      pam_systemd.so
$ cat /proc/self/cgroup 
11:cpuset:/slurm/uid_900002/job_375/step_extern
10:hugetlb:/
9:perf_event:/
8:devices:/system.slice/sshd.service
7:net_prio,net_cls:/
6:cpuacct,cpu:/system.slice/sshd.service
5:pids:/system.slice/sshd.service
4:blkio:/system.slice/sshd.service
3:memory:/slurm/uid_900002/job_375/step_extern
2:freezer:/slurm/uid_900002/job_375/step_extern
1:name=systemd:/system.slice/sshd.service
$
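
The relevant difference between the two listings is the memory line:
/user.slice with pam_systemd enabled, and
/slurm/uid_900002/job_375/step_extern with it disabled. To check this
without reading the whole listing, something like the following sketch
may help. It assumes the cgroup v1 format shown above
(hierarchy-id:controller-list:path); the function names cgroup_path and
in_slurm_cgroup are hypothetical helpers, not part of Slurm:

```shell
# cgroup_path CONTROLLER [FILE]
# Print the cgroup path for CONTROLLER from a cgroup v1 listing.
# FILE defaults to /proc/self/cgroup.
cgroup_path() {
    awk -F: -v want="$1" '
        {
            # Field 2 may list several controllers, e.g. "cpuacct,cpu".
            n = split($2, names, ",")
            for (i = 1; i <= n; i++) {
                if (names[i] == want) {
                    # Rejoin the path in case it contains colons itself.
                    path = $3
                    for (j = 4; j <= NF; j++) path = path ":" $j
                    print path
                    exit
                }
            }
        }
    ' "${2:-/proc/self/cgroup}"
}

# in_slurm_cgroup CONTROLLER [FILE]
# Succeed if CONTROLLER is attached somewhere below /slurm/.
in_slurm_cgroup() {
    case "$(cgroup_path "$1" "$2")" in
        /slurm/*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage (inside the ssh session on the compute node):
#   in_slurm_cgroup memory || echo "memory is NOT in the job cgroup"
```

Run against the first listing, in_slurm_cgroup memory would fail, while
in_slurm_cgroup freezer would succeed; against the second listing both
succeed.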

> This is on slurm 17.11, but SchedMD promised they'd pick it up for
> inclusion in 19.05.x.

I haven't looked into 19.05.x so far. 

> It does require some changes to the slurm code, which are here (for
> 17.11) https://github.com/hpcugent/slurm/pull/28/files
> 
> I hope this helps you out a bit.

Yes, it does. Thanks again.

Best regards
Jürgen