[slurm-users] Slurm does not set memory.limit_in_bytes for tasks (but does for steps)

Jacob Chappell jacob.chappell at uky.edu
Wed Jun 23 14:49:45 UTC 2021


Hi Marcus,

That makes sense, thanks! I suppose then (for monitoring purposes, for
example, without probing scontrol/sacct) if you wanted to figure out the
true maximum memory limit for a task, you'd need to walk up the cgroup
hierarchy and take the smallest value you find.
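
For reference, a rough sketch of that walk (assuming cgroup v1 with the
memory controller mounted at /sys/fs/cgroup/memory; the script and
variable names are mine, untested):

#!/bin/sh
# Find this shell's memory cgroup, then take the smallest
# memory.limit_in_bytes found on the path up to the controller root.
cg=$(grep ':memory:' /proc/$$/cgroup | cut -d: -f3)
dir="/sys/fs/cgroup/memory$cg"
min=$(cat "$dir/memory.limit_in_bytes")
while [ "$dir" != "/sys/fs/cgroup/memory" ]; do
    dir=$(dirname "$dir")
    cur=$(cat "$dir/memory.limit_in_bytes")
    [ "$cur" -lt "$min" ] && min=$cur
done
echo "effective limit: $min bytes"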

__________________________________________________
*Jacob D. Chappell, CSM*
Research Computing | Research Computing Infrastructure
Information Technology Services | University of Kentucky
jacob.chappell at uky.edu


On Wed, Jun 23, 2021 at 6:32 AM Marcus Wagner <wagner at itc.rwth-aachen.de>
wrote:

> Hi Jacob,
>
> I generally think that this is the better way.
> If you had e.g. tasks with different memory needs and the limit were set
> per task, Slurm (or, to be precise, the oom_killer) would kill the job as
> soon as one task exceeded its limit. With the limit set on the step, the
> tasks can "steal" memory from each other.
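>
> A hypothetical illustration (the program name is made up): with
>
> $ srun -n2 --mem=2000M ./uneven_tasks
>
> the 2000M limit sits on the step's cgroup, so task 0 can peak at 1500M
> while task 1 uses only 400M and the step stays within its limit. Were
> each task capped at 1000M instead, task 0 would be OOM-killed even
> though the step as a whole fits.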
>
>
> Best
> Marcus
>
> On 22.06.2021 at 18:46, Jacob Chappell wrote:
> > Hello everyone,
> >
> > I came across a weird behavior and was wondering if this is a bug,
> oversight, or intended?
> >
> > It appears that Slurm does not set memory.limit_in_bytes at the task
> level, but it does set it at the step level and above. Observe:
> >
> > $ grep memory /proc/$$/cgroup
> > 10:memory:/slurm/uid_2001/job_304876/step_0/task_0
> >
> > $ cd /sys/fs/cgroup/memory/slurm/uid_2001/job_304876/step_0/task_0
> >
> > $ cat memory.limit_in_bytes
> > 9223372036854771712     <--- basically unlimited
> >
> > But let's check the parent:
> >
> > $ cat ../memory.limit_in_bytes
> > 33554432000      <-- set properly to around 32 GB, see below
> >
> > $ scontrol show job 304876 | grep mem=
> >     TRES=cpu=8,mem=32000M,node=1,billing=8
> >
> > Now, it does appear that the task is still limited to the step's memory
> limit given the hierarchical nature of cgroups, but I just wanted to
> mention this anyway and see if anyone had any thoughts.
> >
> > Thanks,
> > __________________________________________________
> > *Jacob D. Chappell, CSM*
> > Research Computing | Research Computing Infrastructure
> > Information Technology Services | University of Kentucky
> > jacob.chappell at uky.edu <mailto:jacob.chappell at uky.edu>
>
> --
> Dipl.-Inf. Marcus Wagner
>
> IT Center
> Group: Linux Systems Group
> Department: Systems and Operations
> RWTH Aachen University
> Seffenter Weg 23
> 52074 Aachen
> Tel: +49 241 80-24383
> Fax: +49 241 80-624383
> wagner at itc.rwth-aachen.de
> www.itc.rwth-aachen.de
>
> Social media channels of the IT Center:
> https://blog.rwth-aachen.de/itc/
> https://www.facebook.com/itcenterrwth
> https://www.linkedin.com/company/itcenterrwth
> https://twitter.com/ITCenterRWTH
> https://www.youtube.com/channel/UCKKDJJukeRwO0LP-ac8x8rQ
>
>

