[slurm-users] Cgroups and swap with 18.08.1?
John Hearns
hearnsj at googlemail.com
Tue Oct 16 03:09:13 MDT 2018
Bill, you know this already. But permit me an observation from PPBpro.
Turn up the logging level to maximum on the nodes. Tail the slurm log and
start a job.
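
As a rough illustration of the step above: `SlurmdDebug` and `SlurmdLogFile` are slurm.conf parameters, and `debug5` is the most verbose slurmd level. The log path below is only an assumed example; use whatever your site already has configured.

```
# slurm.conf fragment (sketch; the log path is an assumed example)
SlurmdDebug=debug5
SlurmdLogFile=/var/log/slurm/slurmd.log
```

Once slurmd on the node has re-read the file (restarting slurmd is the conservative option), follow that file with `tail -f` while the test job starts.
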
Look HARD at exactly what the log is telling you - and, as Richard Feynman
says, you are the easiest person to fool.
Don't take the log to say what you think is happening - remember that log
messages take effort to put in the code,
well, at least some keystrokes, so they usually mean something!
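
On the question quoted below, whether the processes really are inside a cgroup, here is a minimal sketch of one way to check. It assumes cgroup v1 and that Slurm places jobs under a `/slurm/...` path in the memory hierarchy, so verify that layout on your own nodes. The function names are made up for illustration, and the second argument exists only so the helper can be pointed at a copy of /proc.

```shell
# Print the cgroup membership of a PID (hypothetical helper name).
# $1 = PID, $2 = optional replacement for /proc (useful for testing).
show_cgroups() {
    pid="$1"
    proc_root="${2:-/proc}"
    cat "${proc_root}/${pid}/cgroup"
}

# Succeed if the memory controller line of the PID starts with /slurm.
# Assumption: cgroup v1 lines look like "9:memory:/slurm/uid_N/job_N/...".
in_slurm_cgroup() {
    show_cgroups "$@" | grep -E '^[0-9]+:[^:]*memory[^:]*:/slurm' >/dev/null
}
```

For the job in the quoted message this would be something like `in_slurm_cgroup 17698 && echo "in a Slurm memory cgroup"`, run on the compute node while the job is alive.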
On Tue, 16 Oct 2018 at 10:04, John Hearns <hearnsj at googlemail.com> wrote:
> Rather dumb question from me - you have checked those processes are
> running within a cgroup?
> I have no experience in constraining the swap usage using cgroups, so
> sorry if I am adding nothing to the debate here.
>
> On Tue, 16 Oct 2018 at 04:49, Bill Broadley <bill at cse.ucdavis.edu> wrote:
>
>>
>> Greetings,
>>
>> I'm using ubuntu-18.04 and slurm-18.08.1 compiled from source.
>>
>> I followed the directions on:
>> https://slurm.schedmd.com/cgroups.html
>>
>> And:
>> https://slurm.schedmd.com/cgroup.conf.html
>>
>> That resulted in:
>> $ cat slurm.conf | egrep -i "cgroup|CR_"
>> ProctrackType=proctrack/cgroup
>> TaskPlugin=task/cgroup
>> SelectTypeParameters=CR_CPU_MEMORY
>> JobAcctGatherType=jobacct_gather/cgroup
>>
>> $ cat /etc/default/grub | grep GRUB_CMDLINE_LINUX=
>> GRUB_CMDLINE_LINUX='cgroup_enable=memory swapaccount=1 console=tty0 transparent_hugepage=madvise console=ttyS0,57600'
>>
>> $ cat cgroup.conf
>> CgroupAutomount=yes
>> ConstrainCores=yes
>> ConstrainDevices=yes
>> ConstrainRAMSpace=yes
>> ConstrainSwapSpace=yes
>> MaxSwapPercent=0
>> AllowedSwapSpace=0
>>
>> So I expect jobs to not use swap. Turns out if I run a 3GB ram process with
>> sbatch --mem=1000 I just get a process that uses 1GB ram and 2GB of swap.
>>
>> So a 3GB process with --mem=1000:
>> $ ps acux
>> USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
>> bill 17698 11.1 1.5 2817020 1015392 ? D 20:40 0:13 stream\
>>
>> $ smem
>> User Count Swap USS PSS RSS
>> bill 1 1795552 1017048 1017076 1018492
>>
>> With --mem=3000 zero swap is used and the job consumes 100% of a CPU.
>>
>>
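
One way to see what limit the kernel was actually given is to compare the two cgroup v1 memory limits for the job. With `ConstrainSwapSpace=yes` and `AllowedSwapSpace=0` one would expect them to be equal, leaving no room for swap. A minimal sketch, assuming cgroup v1 with swap accounting enabled (`swapaccount=1`, as in the grub line above); the function name is made up, and the job's cgroup directory has to be located on the node first.

```shell
# Print how many bytes of swap a cgroup v1 memory group may use.
# $1 = cgroup directory, for example (assumed layout)
#      /sys/fs/cgroup/memory/slurm/uid_1000/job_42
swap_headroom_bytes() {
    cg_dir="$1"
    mem_limit=$(cat "${cg_dir}/memory.limit_in_bytes")
    memsw_limit=$(cat "${cg_dir}/memory.memsw.limit_in_bytes")
    # memsw covers RAM plus swap, so the difference is the swap allowance.
    echo $((memsw_limit - mem_limit))
}
```

A result of 0 would mean swap is capped as intended; a large number would suggest the swap constraint is not being applied to that job.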