[slurm-users] slurm, memory accounting and memory mapping

Bjørn-Helge Mevik b.h.mevik at usit.uio.no
Fri Jan 11 08:47:20 UTC 2019


Sergey Koposov <skoposov at cmu.edu> writes:

> The trick is that my code uses memory mapping (i.e. mmap) of one
> single large file (~12 GB) in each thread on each node.
> With this technique in the past, despite the fact that the file is
> (read-only) mmap'ed in, say, 16 threads, the actual memory footprint
> was still ~12 GB.
> However, when I now do this in Slurm, it thinks that each thread (or
> process) takes 12 GB and kills my processes.

We've seen this too (at least with older versions of Slurm; I haven't
checked lately).  Our way around it was to set

JobAcctGatherParams=NoOverMemoryKill

and use the cgroup task plugin (TaskPlugin=task/cgroup).  The cgroup
plugin will kill jobs that exceed their memory limits (provided you
have set up cgroup.conf to do so), but it does not have the problem of
counting shared memory segments/mmap'ed files once for each
thread/process.  NoOverMemoryKill tells Slurm itself not to kill the
job, but to leave that to the TaskPlugin.
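
In config terms, this amounts to something like the sketch below.  The
two slurm.conf lines are the settings named above; the cgroup.conf
line is an assumption about a minimal setup (not copied from our
configuration), so check the slurm.conf and cgroup.conf man pages for
your Slurm version before relying on it.

```
# slurm.conf
# Tell Slurm's own accounting not to kill jobs over their memory limit
JobAcctGatherParams=NoOverMemoryKill
# Let the cgroup task plugin handle the job's processes
TaskPlugin=task/cgroup

# cgroup.conf (assumed minimal example)
# Have the cgroup enforce the job's memory limit
ConstrainRAMSpace=yes
```

Without a memory constraint in cgroup.conf, nothing would enforce the
limit once NoOverMemoryKill is set, so the two files have to be
changed together.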

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo