[slurm-users] How can jobs request a minimum available (free) TmpFS disk space?

Juergen Salk juergen.salk at uni-ulm.de
Tue Sep 3 09:17:55 UTC 2019


Dear Bjørn-Helge,

this is unfortunately not an answer to your question, but I'd be glad
to hear some more thoughts on this topic, too.

We are also planning to implement disk quotas for the amount of local
scratch space that has been allocated to the job by means of generic
resources (e.g. `--gres=scratch:100´ for 100 GB). This is especially
important when several users share a node.

This leads me to ask how you plan to determine the amount of local
scratch space allocated to the job from within its prolog and epilog
scripts. According to the documentation, there does not seem to be any
environment variable available to the prolog/epilog that indicates the
amount of any type of generic resource allocated to the job [1]. Am I
missing something?

I have already thought about running `scontrol show job $SLURM_JOB_ID´
from within the prolog/epilog scripts in order to get that piece of
information. In our Slurm test environment this reports (among other
information):

[...]
TresPerNode=scratch:100
[...]

This line could then be parsed to obtain the amount of scratch space
allocated to the job (and the result used to increase/decrease the
quota limits for the corresponding $SLURM_JOB_USER in the prolog/epilog
scripts).
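
For the sake of discussion, here is a rough and untested sketch of what
such a prolog snippet might look like. The scratch filesystem path
/local/scratch and the use of setquota(8) for per-user quotas are only
placeholders for whatever mechanism one would actually use:

  #!/bin/bash
  # Prolog sketch (untested): determine the amount of the "scratch" GRES
  # allocated to this job by parsing `scontrol show job` output, since no
  # environment variable seems to carry this information.
  SCRATCH_GB=$(scontrol show job "$SLURM_JOB_ID" \
                 | grep -o 'TresPerNode=[^[:space:]]*' \
                 | grep -o 'scratch:[0-9]*' \
                 | cut -d: -f2)
  SCRATCH_GB=${SCRATCH_GB:-0}

  if [ "$SCRATCH_GB" -gt 0 ]; then
      # Set a hard block limit for the job owner on the scratch
      # filesystem (setquota expects 1 KiB blocks). A real prolog would
      # rather add to the user's existing limit, so that concurrent jobs
      # of the same user on the node do not overwrite each other.
      setquota -u "$SLURM_JOB_USER" 0 "$((SCRATCH_GB * 1024 * 1024))" 0 0 /local/scratch
  fi

The epilog would then have to lower the limit by the corresponding
amount again.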

However, this still looks kind of clumsy to me, and I wonder whether I
have just overlooked a more obvious, cleaner or more robust solution.

Since this is probably not an unusual requirement, I suppose this is
something that many other sites have already solved for themselves. No?

Best regards
Jürgen

[1] https://slurm.schedmd.com/prolog_epilog.html

-- 
Jürgen Salk
Scientific Software & Compute Services (SSCS)
Kommunikations- und Informationszentrum (kiz)
Universität Ulm
Telefon: +49 (0)731 50-22478
Telefax: +49 (0)731 50-22471


* Bjørn-Helge Mevik <b.h.mevik at usit.uio.no> [190903 09:19]:
> We are facing more or less the same problem.  We have historically
> defined a Gres "localtmp" with the number of GB initially available
> on local disk, and then jobs ask for --gres=localtmp:50 or similar.
> 
> That prevents slurm from allocating jobs on the cluster if they ask for
> more disk than is currently "free" -- in the sense of "not handed out to
> a job".  But it doesn't prevent jobs from using more than they have
> asked for, so the disk might have less (real) free space than slurm
> thinks.
> 
> As far as I can see, cgroups does not support limiting used disk space,
> only amount of IO/s and similar.
> 
> We are currently considering using file system quotas for enforcing
> this.  Our localtmp disk is a separate xfs partition, and the idea is to
> make the prolog set up a "project" disk quota for the job on the
> localtmp file system, and the epilog to remove it again.
> 
> I'm not 100% sure we will make it work, but I'm hopeful.  Fingers
> crossed! :)
> 
> -- 
> Regards,
> Bjørn-Helge Mevik, dr. scient,
> Department for Research Computing, University of Oslo
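
To make that idea a bit more concrete: the prolog/epilog project quota
setup described above might look roughly like the following untested
sketch. The mount point /local/scratch, the per-job directory layout
and the use of the numeric job ID as XFS project ID are merely my
assumptions, and the filesystem would have to be mounted with the
prjquota option. $SCRATCH_GB would be derived from the job's GRES as
sketched further up.

  # --- prolog (sketch, untested) ---
  JOBDIR=/local/scratch/$SLURM_JOB_ID
  mkdir -p "$JOBDIR"
  chown "$SLURM_JOB_USER" "$JOBDIR"
  # Tie the job directory to an XFS project (project ID = job ID) and
  # set a hard block limit matching the allocated scratch GRES.
  xfs_quota -x -c "project -s -p $JOBDIR $SLURM_JOB_ID" /local/scratch
  xfs_quota -x -c "limit -p bhard=${SCRATCH_GB}g $SLURM_JOB_ID" /local/scratch

  # --- epilog (sketch, untested) ---
  JOBDIR=/local/scratch/$SLURM_JOB_ID
  # Drop the limit, clear the project flag and clean up the directory.
  xfs_quota -x -c "limit -p bhard=0 $SLURM_JOB_ID" /local/scratch
  xfs_quota -x -c "project -C -p $JOBDIR $SLURM_JOB_ID" /local/scratch
  rm -rf "$JOBDIR"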



-- 
GPG A997BA7A | 87FC DA31 5F00 C885 0DC3  E28F BD0D 4B33 A997 BA7A


