[slurm-users] memory limits:: why job is not killed but oom-killer steps up?

Adrian Sevcenco Adrian.Sevcenco at spacescience.ro
Wed Jan 12 22:04:34 UTC 2022


Hi! I have a problem with enforcing memory limits.
I'm using cgroups to enforce the limits, and I expected that when the
cgroup memory limit is reached the job would be killed.
Instead, I see a lot of oom-killer reports in the log, each acting only on a
certain process from the cgroup.

Did I miss anything in my configuration? I have the following:

SelectType=select/cons_res
SelectTypeParameters=CR_CPU_MEMORY,CR_LLN

The partition has:
DefMemPerCPU=3950 MaxMemPerCPU=4010  (as I understand it, these are in MiB, and physically I have 4 GiB per thread)
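
To make the units explicit, here is a small Python sketch of the arithmetic. It assumes the two values are in MiB, as stated above, and that 4 GiB per thread means 4096 MiB; the names are only illustrative.

```python
# Values from the partition definition above, assumed to be in MiB.
DEF_MEM_PER_CPU_MIB = 3950
MAX_MEM_PER_CPU_MIB = 4010

# Physical memory per hardware thread: 4 GiB = 4 * 1024 MiB.
PHYSICAL_PER_THREAD_MIB = 4 * 1024


def headroom_mib(limit_mib, physical_mib=PHYSICAL_PER_THREAD_MIB):
    """Return how many MiB per thread are left outside the given limit."""
    return physical_mib - limit_mib


if __name__ == "__main__":
    print("headroom at DefMemPerCPU:", headroom_mib(DEF_MEM_PER_CPU_MIB), "MiB")
    print("headroom at MaxMemPerCPU:", headroom_mib(MAX_MEM_PER_CPU_MIB), "MiB")
```

So a job at the default leaves 146 MiB per thread for everything else on the node, and a job at the maximum leaves 86 MiB.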

cat cgroup.conf
CgroupAutomount=yes
TaskAffinity=no
ConstrainCores=yes
ConstrainRAMSpace=yes

ProctrackType=proctrack/cgroup

JobAcctGatherType=jobacct_gather/linux
JobAcctGatherFrequency=task=15,filesystem=120
JobAcctGatherParams=UsePss

TaskPlugin=task/affinity,task/cgroup
TaskPluginParam=autobind=threads
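
As a sanity check of the settings listed above, here is a minimal Python sketch that parses simple Key=Value lines and confirms that ConstrainRAMSpace is enabled and that task/cgroup is among the task plugins. The function names are hypothetical helpers, not part of Slurm, and the check says nothing about how the kernel behaves once the limit is hit.

```python
def parse_conf(text):
    """Parse simple Key=Value lines, skipping blanks and '#' comments.

    Keys are lower-cased because Slurm treats parameter names
    case-insensitively. Only one Key=Value pair per line is handled.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


def memory_is_cgroup_constrained(settings):
    """True if RAM is constrained and the task/cgroup plugin is loaded."""
    plugins = [p.strip() for p in settings.get("taskplugin", "").split(",")]
    ram_constrained = settings.get("constrainramspace", "no").lower() == "yes"
    return ram_constrained and "task/cgroup" in plugins


# The relevant lines as posted above.
POSTED_SETTINGS = """
ConstrainCores=yes
ConstrainRAMSpace=yes
ProctrackType=proctrack/cgroup
TaskPlugin=task/affinity,task/cgroup
"""

if __name__ == "__main__":
    print(memory_is_cgroup_constrained(parse_conf(POSTED_SETTINGS)))
```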

Is there a problem with my expectation that I should not see the oom-killer,
or with my configuration?

Thank you!
Adrian


