[slurm-users] memory limits:: why job is not killed but oom-killer steps up?

Hermann Schwärzler hermann.schwaerzler at uibk.ac.at
Thu Jan 13 08:59:19 UTC 2022

Hi Adrian,


Your current configuration has the effect that when the memory the job 
requested is exhausted, the processes of the job will start paging/swapping.

If you want to stop jobs that use more memory (RSS, to be precise) than 
they requested, you have to add this to your cgroup.conf:



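(The snippet itself did not survive in the archive. Based on the options documented in Slurm's cgroup.conf man page, the usual fix in this situation is to constrain swap as well, so a job that exceeds its RAM request cannot simply page out; the exact lines below are an assumption, not the author's original text:)

```
# Assumed completion -- the original snippet is missing from the archive.
# With ConstrainRAMSpace=yes alone, a job over its limit is pushed to swap;
# constraining swap as well makes the kernel OOM-kill it instead.
ConstrainSwapSpace=yes
# Percent of the RAM request additionally allowed as swap; 0 means no
# swap beyond the limit (0 is also the documented default once
# ConstrainSwapSpace is enabled).
AllowedSwapSpace=0
```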
On 1/12/22 11:04 PM, Adrian Sevcenco wrote:
> Hi! I have a problem with enforcing the memory limits...
> I'm using cgroups to enforce the limits and I had expected that when
> the cgroup memory limits are reached the job is killed ...
> Instead I see in the log a lot of oom-killer reports that act only on
> a certain process from the cgroup ...
> Did I miss anything in my configuration? I have the following:
> SelectType=select/cons_res
> SelectTypeParameters=CR_CPU_MEMORY,CR_LLN
> the partition has:
> DefMemPerCPU=3950 MaxMemPerCPU=4010  (I understood that these are MiB,
> and physically I have 4 GiB/thread)
> cat cgroup.conf
> CgroupAutomount=yes
> TaskAffinity=no
> ConstrainCores=yes
> ConstrainRAMSpace=yes
> ProctrackType=proctrack/cgroup
> JobAcctGatherType=jobacct_gather/linux
> JobAcctGatherFrequency=task=15,filesystem=120
> JobAcctGatherParams=UsePss
> TaskPlugin=task/affinity,task/cgroup
> TaskPluginParam=autobind=threads
> Is there a problem with my expectation that I should not see the
> oom-killer, or with my configuration?
> Thank you!
> Adrian
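Adrian's observation (the OOM killer terminating a single process rather than the whole job) mirrors how kernel memory limits behave in general: the kernel picks one victim inside the limited group and the rest continue until something notices. A minimal sketch of that single-victim behavior, using a plain per-process address-space limit (`RLIMIT_AS`) as a stand-in for the cgroup limit; this is an illustration, not what Slurm itself does:

```python
# Sketch only: a per-process RLIMIT_AS cap stands in for the cgroup
# memory limit. Only the process that exceeds the limit is affected;
# the parent (and any sibling processes) keep running.
import subprocess
import sys

CHILD = r"""
import resource
# Cap this child's address space at 1 GiB.
resource.setrlimit(resource.RLIMIT_AS, (1024**3, 1024**3))
try:
    blob = bytearray(2 * 1024**3)  # try to allocate ~2 GiB
    print("allocated")
except MemoryError:
    # Only this process feels the limit.
    print("over-limit")
"""

result = subprocess.run([sys.executable, "-c", CHILD],
                        capture_output=True, text=True)
print("child said:", result.stdout.strip())
print("parent still running")
```

The point of the sketch: the memory limit is enforced against the offending allocation, not against the job as a whole, which is why extra configuration (such as Slurm's `OverMemoryKill`/swap constraints) is needed if the desired behavior is "kill the entire job".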
