[slurm-users] How can jobs request a minimum available (free) TmpFS disk space?

Bjørn-Helge Mevik b.h.mevik at usit.uio.no
Tue Sep 3 07:19:42 UTC 2019


We are facing more or less the same problem.  We have historically
defined a Gres "localtmp" with the number of GB initially available
on local disk, and then jobs ask for --gres=localtmp:50 or similar.

That prevents Slurm from placing a job on a node if it asks for more
disk than is currently "free" there -- in the sense of "not handed out
to a job".  But it doesn't prevent jobs from using more than they have
asked for, so the disk might have less (real) free space than Slurm
thinks.
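
To make the mismatch concrete, here is a toy model of the bookkeeping
(not Slurm code; the class, the job names and the numbers are made up
for illustration).  It only shows how the "handed out" total and the
real usage can drift apart when nothing enforces the request:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy model of one node's local scratch disk (all sizes in GB)."""
    total_gb: int
    granted: dict = field(default_factory=dict)  # job id -> GB requested
    used: dict = field(default_factory=dict)     # job id -> GB actually written

    def scheduler_free(self) -> int:
        # What the scheduler believes is free: total minus handed-out amounts.
        return self.total_gb - sum(self.granted.values())

    def real_free(self) -> int:
        # What is really free on disk: total minus what jobs actually wrote.
        return self.total_gb - sum(self.used.values())

    def try_allocate(self, job_id: str, request_gb: int) -> bool:
        # Admission is decided purely on the bookkeeping, never on real usage.
        if request_gb > self.scheduler_free():
            return False
        self.granted[job_id] = request_gb
        self.used[job_id] = 0
        return True


node = Node(total_gb=100)
node.try_allocate("A", 50)
node.used["A"] = 80                      # job A writes more than it asked for
accepted_b = node.try_allocate("B", 50)  # still admitted: 50 GB look "free"
```

In this made-up scenario job B is admitted with a 50 GB request even
though node.real_free() is down to 20 GB.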

As far as I can see, cgroups do not support limiting the amount of disk
space used, only I/O rates and similar.

We are currently considering using file system quotas for enforcing
this.  Our localtmp disk is a separate xfs partition, and the idea is to
have the prolog set up a "project" disk quota for the job on the
localtmp file system, and have the epilog remove it again.
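
A rough sketch of what such a prolog/epilog pair might run, written as
Python helpers that only build the command lines.  This is not something
we have in production.  The mount point /localtmp, the per-job directory
layout and the use of the job id as xfs project id are all assumptions,
as is the way the prolog learns the requested size (here it is simply
passed in).  The file system would have to be mounted with project
quotas enabled (prjquota), and ownership of the job directory is not
handled here.

```python
import shlex
import subprocess
from typing import List

MOUNT = "/localtmp"  # assumed mount point of the xfs scratch partition


def prolog_commands(job_id: int, limit_gb: int, mount: str = MOUNT) -> List[List[str]]:
    """Commands to create a per-job directory with a hard block quota."""
    job_dir = f"{mount}/job{job_id}"  # hypothetical directory layout
    return [
        ["mkdir", "-p", job_dir],
        # Tag the directory tree with a project id (assumption: reuse the job id).
        ["xfs_quota", "-x", "-c", f"project -s -p {job_dir} {job_id}", mount],
        # Hard limit on blocks used by that project.
        ["xfs_quota", "-x", "-c", f"limit -p bhard={limit_gb}g {job_id}", mount],
    ]


def epilog_commands(job_id: int, mount: str = MOUNT) -> List[List[str]]:
    """Commands to drop the limit again and clean up the job directory."""
    job_dir = f"{mount}/job{job_id}"
    return [
        # A limit of 0 means "no limit" for that project id.
        ["xfs_quota", "-x", "-c", f"limit -p bhard=0 {job_id}", mount],
        ["rm", "-rf", job_dir],
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands, or execute them (needs root) if dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(shlex.quote(arg) for arg in cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling run(prolog_commands(1234, 50)) prints the three commands without
executing anything, which is a cheap way to check them by hand first.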

I'm not 100% sure we will make it work, but I'm hopeful.  Fingers
crossed! :)

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo