[slurm-users] ConstrainRAMSpace=yes and page cache?

Ryan Novosielski novosirj at rutgers.edu
Fri Jun 21 21:59:01 UTC 2019


I’ve suspected for some time that this matters in our environment, though we /do/ use GPFS. Could any use of local scratch (XFS on a local drive) also figure in here?

Are there any tips for how to easily determine where the extra memory is coming from, for example when the user has specifically constrained the application to a certain amount of memory with its own flags, or, to put it another way, how to prove that it’s not this sort of phenomenon happening?
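
(For what it’s worth, the way I’ve been trying to sanity-check this, assuming cgroup v1 and task/cgroup, is to compare the "cache" and "rss" counters in the job’s memory cgroup while the job runs. The uid/job IDs below are just placeholders, and the exact hierarchy varies by site and Slurm version:

  # hypothetical uid/job; adjust to whatever cgroup path your Slurm actually creates
  CG=/sys/fs/cgroup/memory/slurm/uid_12345/job_9876543
  egrep '^(cache|rss|mapped_file) ' $CG/memory.stat
  # the kernel-memory counter that ConstrainKmemSpace acts on
  cat $CG/memory.kmem.usage_in_bytes

If "cache" dwarfs "rss" while the application’s own accounting stays within its limit, that would seem to point at page cache rather than the application itself.)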

On Jun 21, 2019, at 13:04, Christopher Samuel <chris at csamuel.org> wrote:

On 6/13/19 5:27 PM, Kilian Cavalotti wrote:

I would take a look at the various *KmemSpace options in cgroup.conf;
they can certainly help with this.

Specifically I think you'll want:

ConstrainKmemSpace=no

to fix this.  This happens for NFS- and Lustre-based systems; I don't think it's a problem for GPFS, as mmfsd has its own pagepool separate from the processes' address space.
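
For reference, the relevant bit of cgroup.conf would then look something like this (just a sketch; the Allowed* values are placeholders, not a recommendation):

  # cgroup.conf (sketch): keep RAM constrained but leave kernel-memory
  # accounting unconstrained
  ConstrainRAMSpace=yes
  ConstrainSwapSpace=yes
  ConstrainKmemSpace=no
  # placeholder percentages, tune per site
  AllowedRAMSpace=100
  AllowedSwapSpace=0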

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA

--
____
|| \\UTGERS,       |---------------------------*O*---------------------------
||_// the State     |         Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ     | Office of Advanced Research Computing - MSB C630, Newark
    `'

