[slurm-users] pam_slurm_adopt does not constrain memory?
    Christian Peter
    christian.peter at itwm.fraunhofer.de
    Wed Aug 22 08:27:59 MDT 2018

hi,
we observed a strange behavior of pam_slurm_adopt regarding the 
involved cgroups:
when we start a shell as a new Slurm job using "srun", the process is 
placed in freezer, cpuset and memory cgroups set up as e.g. 
"/slurm/uid_501/job_410318/step_0". that's good!
however, another shell started by an SSH login to the same node is 
handled by pam_slurm_adopt. that process is only placed in the freezer 
and cpuset cgroups set up as "/slurm/uid_501/job_410318/step_extern". 
its "memory" cgroup is not configured and stays at "/user.slice". (see 
output below)
as a consequence, none of the tools started from this shell prompt are 
subject to the job's memory limit. that's bad for our use case as we 
need to partition the memory of our SMP machines among several 
independent jobs/users.
is this an expected behavior of pam_slurm_adopt/slurmstepd?
or maybe a configuration issue? did i miss something?
a bug? to me, it looks similar to this old issue...
https://bugs.schedmd.com/show_bug.cgi?id=2236
we're currently running Slurm 17.11.8. (we've already seen this with 
our previous version 17.11.5.)
thanks for your help and suggestions!
   christian
--------------------------------------
== cgroups within srun ==
login$ srun --pty bash
node064$ cat /proc/self/cgroup
11:pids:/system.slice/slurmd.service
10:freezer:/slurm/uid_501/job_410318/step_0
9:cpuset:/slurm/uid_501/job_410318/step_0
8:cpuacct,cpu:/system.slice/slurmd.service
7:net_prio,net_cls:/
6:blkio:/system.slice/slurmd.service
5:perf_event:/
4:devices:/system.slice/slurmd.service
3:memory:/slurm/uid_501/job_410318/step_0
2:hugetlb:/
1:name=systemd:/system.slice/slurmd.service
== cgroups for external step ==
login$ ssh node064
node064$ cat /proc/self/cgroup
11:pids:/user.slice
10:freezer:/slurm/uid_501/job_410318/step_extern
9:cpuset:/slurm/uid_501/job_410318/step_extern
8:cpuacct,cpu:/user.slice
7:net_prio,net_cls:/
6:blkio:/user.slice
5:perf_event:/
4:devices:/user.slice
3:memory:/user.slice
2:hugetlb:/
1:name=systemd:/user.slice/user-501.slice/session-430.scope
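-- scripted check --
the comparison of the two listings above can also be done by a small script. this is only a rough sketch, assuming the cgroup v1 line format "hierarchy-ID:controller-list:path" seen in the output and assuming that job cgroups live under "/slurm/" as on our nodes. the function names parse_cgroup and outside_slurm are made up for this example and are not part of Slurm.

```python
def parse_cgroup(text):
    """Map each controller name to its cgroup path.

    Expects the contents of /proc/<pid>/cgroup, one entry per line,
    in the form "hierarchy-ID:controller-list:path".
    """
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at most twice so that colons inside the path survive.
        _, controllers, path = line.split(":", 2)
        # Co-mounted controllers share one line, e.g. "cpuacct,cpu".
        for name in controllers.split(","):
            result[name] = path
    return result


def outside_slurm(text, wanted=("freezer", "cpuset", "memory")):
    """Return the wanted controllers whose path is not under /slurm/.

    The "/slurm/" prefix is an assumption taken from the listings above.
    """
    cgroups = parse_cgroup(text)
    return [c for c in wanted if not cgroups.get(c, "").startswith("/slurm/")]


# Typical use on a compute node (hypothetical invocation):
#   with open("/proc/self/cgroup") as f:
#       print(outside_slurm(f.read()))
```

fed with the "external step" listing above, outside_slurm should report only the memory controller. fed with the srun listing, it should report nothing.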