[slurm-users] Jobs escaping cgroup device controls after some amount of time.

Nate Coraor nate at bx.psu.edu
Mon Apr 30 14:52:16 MDT 2018


Never mind - it appears to happen when puppet runs. I have no hand in that,
so I'll kick it to those admins and report back with what I find.

I ruled out slurm by simply creating a non-slurm cgroup, with e.g.
`cgcreate -g memory:test`, and that cgroup also disappeared unexpectedly.
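
A minimal way to reproduce that check, in case it is useful to anyone else (a
sketch only; the cgroup v1 mountpoint /sys/fs/cgroup/memory and the 60-second
poll interval are assumptions for a stock CentOS 7 node):

  # create a throwaway cgroup outside of slurm's hierarchy
  cgcreate -g memory:test
  # wait for the directory to vanish, then note when it happened; if it
  # disappears, something other than slurm is removing cgroups
  while [ -d /sys/fs/cgroup/memory/test ]; do sleep 60; done
  date; echo "memory:test cgroup is gone"

Correlating that timestamp with the puppet agent's run schedule should confirm
or rule out the connection.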

--nate

On Mon, Apr 30, 2018 at 4:37 PM, Nate Coraor <nate at bx.psu.edu> wrote:

> Hi Shawn,
>
> I'm wondering if you're still seeing this. I've recently enabled
> task/cgroup on 17.11.5 running on CentOS 7 and just discovered that jobs
> are escaping their cgroups. For me this results in a lot of jobs ending in
> OUT_OF_MEMORY when they shouldn't, because it appears slurmd thinks the
> oom-killer has triggered when it hasn't. I'm not using GRES or devices,
> only:
>
> cgroup.conf:
>
> CgroupAutomount=yes
> ConstrainCores=yes
> ConstrainRAMSpace=yes
> ConstrainSwapSpace=yes
>
> slurm.conf:
>
> JobAcctGatherType=jobacct_gather/cgroup
> JobAcctGatherFrequency=task=15
> ProctrackType=proctrack/cgroup
> TaskPlugin=task/cgroup
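>
> A quick way to check for an escape on a node (a sketch, not verified here; it
> assumes the default cgroup v1 layout, i.e. slurm's hierarchy lives under
> /sys/fs/cgroup/<controller>/slurm/uid_<uid>/job_<jobid>, and the job id,
> user, and PID below are hypothetical):
>
>   JOBID=12345; JOBUSER=someuser; SUSPECT_PID=67890
>   # PIDs slurm is still constraining in the job's memory cgroup
>   cat /sys/fs/cgroup/memory/slurm/uid_$(id -u $JOBUSER)/job_$JOBID/cgroup.procs
>   # cgroup the suspect process actually sits in; an escaped process will not
>   # show the slurm/uid_*/job_* path on its memory line
>   grep memory /proc/$SUSPECT_PID/cgroup
>
> If a job's processes show up outside that path while the job is still
> running, the escape is real rather than an accounting artifact.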
>
> The only thing that seems to correspond is this pair of log messages:
>
> [JOB_ID.batch] debug:  Handling REQUEST_STATE
> debug:  _fill_registration_msg: found apparently running job JOB_ID
>
> Thanks,
> --nate
>
> On Mon, Apr 23, 2018 at 4:41 PM, Kevin Manalo <kmanalo at jhu.edu> wrote:
>
>> Shawn,
>>
>> Just to give you a point of comparison:
>>
>> We have the following related entries in slurm.conf:
>>
>> JobAcctGatherType=jobacct_gather/linux   # will migrate to cgroup eventually
>> JobAcctGatherFrequency=30
>> ProctrackType=proctrack/cgroup
>> TaskPlugin=task/affinity,task/cgroup
>>
>> cgroup_allowed_devices_file.conf:
>>
>> /dev/null
>> /dev/urandom
>> /dev/zero
>> /dev/sda*
>> /dev/cpu/*/*
>> /dev/pts/*
>> /dev/nvidia*
>>
>> gres.conf (4 K80s on a 24-core Haswell node):
>>
>> Name=gpu File=/dev/nvidia0 CPUs=0-5
>> Name=gpu File=/dev/nvidia1 CPUs=12-17
>> Name=gpu File=/dev/nvidia2 CPUs=6-11
>> Name=gpu File=/dev/nvidia3 CPUs=18-23
>>
>> I also looked at multi-tenant jobs on our MARCC cluster that have been
>> running for more than a day, and they are still inside their cgroups, but
>> again this is on CentOS 6 clusters.
>>
>> Are you still seeing cgroup escapes now, specifically for jobs > 1 day?
>>
>> Thanks,
>> Kevin
>>
>> From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of
>> Shawn Bobbin <sabobbin at umiacs.umd.edu>
>> Reply-To: Slurm User Community List <slurm-users at lists.schedmd.com>
>> Date: Monday, April 23, 2018 at 2:45 PM
>> To: Slurm User Community List <slurm-users at lists.schedmd.com>
>> Subject: Re: [slurm-users] Jobs escaping cgroup device controls after
>> some amount of time.
>>
>> Hi,
>>
>> I attached our cgroup.conf and gres.conf.
>>
>> As for the cgroup_allowed_devices.conf file, I have this file stubbed but
>> empty.  In 17.02 slurm started fine without this file (as far as I
>> remember) and it being empty doesn’t appear to actually impact anything…
>> device availability remains the same.  Based on the behavior explained in
>> [0] I don’t expect this file to impact specific GPU containment.
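>>
>> For what it's worth, one quick way to see whether the devices controller is
>> actually constraining a job (a sketch; it assumes cgroup v1 with slurm's
>> hierarchy under /sys/fs/cgroup/devices/slurm/uid_<uid>/job_<jobid>, and the
>> job id and user below are hypothetical):
>>
>>   JOBID=12345; JOBUSER=someuser
>>   # device rules applied to the job; with ConstrainDevices=yes this should
>>   # be a short whitelist, while "a *:* rwm" means everything is allowed
>>   cat /sys/fs/cgroup/devices/slurm/uid_$(id -u $JOBUSER)/job_$JOBID/devices.list
>>
>> Watching that file over the life of a long job might show when the device
>> controls stop applying.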
>>
>> TaskPlugin = task/cgroup
>> ProctrackType = proctrack/cgroup
>> JobAcctGatherType = jobacct_gather/cgroup
>>
>> [0] https://bugs.schedmd.com/show_bug.cgi?id=4122
>>
>