[slurm-users] Cgroups not constraining memory & cores

Fri Nov 11 16:48:11 UTC 2022

Hi,

Many thanks for that pointer, Sean. I had missed the PrologFlags=Contain setting, so I have now added it to slurm.conf.

I've also explicitly built Slurm with PAM support:

../configure --sysconfdir=/home/support/pkgs/slurm/etc --prefix=/home/support/pkgs/slurm/ubuntu_20.04/21.08.8-2 --localstatedir=/var/run/slurm --enable-pam && make && make install

Slurm tasks now appear to be launching within cgroups.

For example, if I run

srun --mem=100 sleep 300&

and log in to the node, I can see the memory limits of the cgroups:

$ cat /sys/fs/cgroup/memory/slurm/uid_5446/memory.limit_in_bytes
9223372036854771712
$ cat /sys/fs/cgroup/memory/slurm/uid_5446/job_24/memory.limit_in_bytes
269907656704
$ cat /sys/fs/cgroup/memory/slurm/uid_5446/job_24/step_0/memory.limit_in_bytes 
269907656704
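
To make the numbers above easier to compare with the request, here is a minimal sketch that converts a cgroup limit to MiB. The helper names bytes_to_mib and limit_matches_request are made up for illustration, the byte values are the ones printed above, and it assumes that --mem=100 means 100 MiB.

```shell
#!/usr/bin/env bash
# Convert a cgroup v1 memory.limit_in_bytes value to whole MiB.
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}

# Succeed only if the cgroup limit (bytes) equals the requested size (MiB).
# Hypothetical helper for illustration, not part of Slurm.
limit_matches_request() {
    local limit_bytes=$1 requested_mib=$2
    [ "$(bytes_to_mib "$limit_bytes")" -eq "$requested_mib" ]
}

bytes_to_mib 269907656704   # job_24 limit from the output above: prints 257404
bytes_to_mib 104857600      # 100 MiB expressed in bytes: prints 100

if limit_matches_request 269907656704 100; then
    echo "job limit matches --mem=100"
else
    echo "job limit does NOT match --mem=100"
fi
```

The job_24 limit works out to 257404 MiB, which is the node's total memory as reported by top in the original message quoted below (257404.1 MiB), not the 100 requested.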

But if I run the following to over-allocate memory, it still lets me:

srun --mem=100 stoopid-memory-overallocation.x

The process allocates more memory on the node than the --mem=100 request should allow.

I'm clearly doing something wrong here. Can anyone point out what it is please? Am I just using the wrong test methodology?

Thanks in advance 

Sean

November 8, 2022 1:48 PM, "Sean Maxwell" <stm at case.edu> wrote:
Hi Sean,

I don't see PrologFlags=Contain in your slurm.conf. It is one of the entries required to activate the cgroup containment: https://slurm.schedmd.com/cgroup.conf.html#OPT_/etc/slurm/slurm.conf

Best,

-Sean

On Tue, Nov 8, 2022 at 8:16 AM Sean McGrath <smcgrat at tchpc.tcd.ie> wrote:

Hi,

I can't get cgroups to constrain memory or cores. I would be very grateful if anyone could point out what I am doing wrong.

Testing:

Request a core and 2G of memory, log in to the allocated node, and compile a binary that just allocates memory quickly:

$ salloc -n 1 --mem=2G
$ ssh $SLURM_NODELIST
$ cat stoopid-memory-overallocation.c
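/*
 *
 * Sometimes you need to over allocate the memory available to you.
 * This does so splendidly. I just hope you have limits set to kill it!
 *
 */

#include <stdlib.h> /* malloc */
#include <string.h> /* memset */

int main(void)
{
    while (1)
    {
        /* Allocate 1 MiB and touch every byte so the pages become resident. */
        void *m = malloc(1024*1024);
        if (m == NULL)
            return 1; /* allocation refused: stop rather than write through NULL */
        memset(m, 0, 1024*1024);
    }
    return 0;
}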
$ gcc -o stoopid-memory-overallocation.x stoopid-memory-overallocation.c

Checking memory usage before as a baseline:

$ free -g
              total        used        free      shared  buff/cache   available
Mem:            251           1         246           0           3         248
Swap:             7           0           7

Launch the over-allocation program and check memory use again: 34G is now in use, when I expected the job to be constrained to 2G:

$ ./stoopid-memory-overallocation.x &
$ sleep 10 && free -g
              total        used        free      shared  buff/cache   available
Mem:            251          34         213           0           3         215
Swap:             7           0           7

Run a second process to check the CPU constraints:

$ ./stoopid-memory-overallocation.x &

Checking with top, I can see that the two processes are running simultaneously, each at 100% CPU:

$ top
top - 13:04:44 up 13 days, 23:39, 2 users, load average: 0.63, 0.27, 0.11
Tasks: 525 total, 3 running, 522 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.7 us, 5.5 sy, 0.0 ni, 93.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 257404.1 total, 181300.3 free, 72588.6 used, 3515.2 buff/cache
MiB Swap: 8192.0 total, 8192.0 free, 0.0 used. 183300.3 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 120978 smcgrat   20   0   57.6g  57.6g    968 R 100.0  22.9   0:22.63 stoopid-memory-
 120981 smcgrat   20   0   11.6g  11.6g    952 R 100.0   4.6   0:04.57 stoopid-memory-
...
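
With a single core requested (salloc -n 1), two busy processes confined to that core would be expected to share it rather than each show 100% CPU. A rough way to check confinement, assuming a Linux /proc filesystem, is to count the CPUs the current shell is allowed to run on. count_cpus is a hypothetical helper written for this sketch, not part of Slurm.

```shell
#!/usr/bin/env bash
# Count the CPUs in a kernel CPU list such as "0-3,8".
count_cpus() {
    local list=$1 total=0 range lo hi
    local IFS=,
    for range in $list; do
        lo=${range%-*}   # start of the range (or the single CPU id)
        hi=${range#*-}   # end of the range (or the single CPU id)
        total=$(( total + hi - lo + 1 ))
    done
    echo "$total"
}

# Cpus_allowed_list in /proc/self/status is the set of CPUs this shell may use.
if [ -r /proc/self/status ]; then
    allowed=$(awk '/^Cpus_allowed_list:/ { print $2 }' /proc/self/status)
    echo "allowed CPUs: $allowed ($(count_cpus "$allowed") in total)"
fi
```

If the shell used for the test reports every CPU on the node, the processes started from it are not being confined to the allocated core.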

Is this actually a valid test case or am I doing something else wrong?
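
One way to check the test method itself is to see whether the shell that launches the binary is inside the job's cgroup at all. The sketch below assumes cgroup v1 and the /slurm/uid_<uid>/job_<id> layout seen under /sys/fs/cgroup/memory in this thread; in_slurm_memory_cgroup is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Succeed if the given /proc/<pid>/cgroup file places the process in a
# Slurm job cgroup for the memory controller (cgroup v1 line format:
# "<id>:<controllers>:<path>").
in_slurm_memory_cgroup() {
    local cgroup_file=${1:-/proc/self/cgroup}
    grep -Eq '^[0-9]+:([^:]*,)?memory(,[^:]*)?:/slurm/uid_[0-9]+/job_[0-9]+' \
        "$cgroup_file" 2>/dev/null
}

if in_slurm_memory_cgroup; then
    echo "this shell is inside a Slurm job memory cgroup"
else
    echo "this shell is NOT inside a Slurm job memory cgroup"
fi
```

Running this once from the ssh session and once through srun within the same allocation would show whether both shells end up under the job's memory cgroup.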

Thanks

Sean

Setup details:

Ubuntu 20.04.5 LTS (Focal Fossa).
slurm 21.08.8-2.
cgroup-tools version 0.41-10 installed.

The following was set in /etc/default/grub, and update-grub was run:

GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"

Relevant parts of scontrol show conf:

JobAcctGatherType = jobacct_gather/none
ProctrackType = proctrack/cgroup
TaskPlugin = task/cgroup
TaskPluginParam = (null type)
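
After editing slurm.conf it can help to confirm what the running daemons actually report. The following is a minimal sketch that pulls single values out of "Key = value" output like the lines above. conf_value is a hypothetical helper, and it assumes the keys appear in the scontrol output under the same names as in slurm.conf.

```shell
#!/usr/bin/env bash
# Print the value for one key from "Key = value" lines read on stdin.
conf_value() {
    awk -v k="$1" '{
        pos = index($0, "=")
        if (!pos) next
        name = substr($0, 1, pos - 1)
        gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim padding around the key
        if (name == k) {
            val = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", val) # trim padding around the value
            print val
            exit
        }
    }'
}

# Only query the controller when scontrol is available on this machine.
if command -v scontrol >/dev/null 2>&1; then
    for key in PrologFlags TaskPlugin ProctrackType SelectTypeParameters; do
        printf '%s = %s\n' "$key" "$(scontrol show config | conf_value "$key")"
    done
fi
```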

The contents of the full slurm.conf:

ClusterName=neuro
SlurmctldHost=neuro01(192.168.49.254)
AuthType=auth/munge
CommunicationParameters=block_null_hash
CryptoType=crypto/munge
Epilog=/home/support/slurm/etc/slurm.epilog.local
EpilogSlurmctld=/home/support/slurm/etc/slurm.epilogslurmctld
JobRequeue=0
MaxJobCount=30000
MpiDefault=none
Prolog=/home/support/slurm/etc/prolog
ReturnToService=2
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
SlurmdPort=6818
SlurmUser=root
StateSaveLocation=/var/slurm_state/neuro
SwitchType=switch/none
TaskPlugin=task/cgroup
ProctrackType=proctrack/cgroup
RebootProgram=/sbin/reboot
InactiveLimit=0
KillWait=30
MinJobAge=300
SlurmctldTimeout=300
SlurmdTimeout=300
Waittime=0
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_Core
AccountingStorageHost=service01
AccountingStorageType=accounting_storage/slurmdbd
JobCompType=jobcomp/none
JobAcctGatherFrequency=30
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurm.log
DefMemPerNode=257300
MaxMemPerNode=257300
NodeName=neuro-n01-mgt RealMemory=257300 Sockets=2 CoresPerSocket=16 State=UNKNOWN
NodeName=neuro-n02-mgt RealMemory=257300 Sockets=2 CoresPerSocket=16 State=UNKNOWN
PartitionName=compute Nodes=ALL Default=YES MaxTime=5760 State=UP Shared=YES

cgroup.conf file contents:

CgroupAutomount=yes
ConstrainCores=yes
ConstrainRAMSpace=yes
TaskAffinity=no