[slurm-users] Per-job TMPDIR: how to look up gres allocation in prolog?

Mark Dixon mark.c.dixon at durham.ac.uk
Tue Nov 16 16:49:34 UTC 2021


Hi everyone,

I'd like to configure Slurm so that users can request an amount of disk
space for TMPDIR with commands like "sbatch --gres tmp:10G jobscript.sh",
and have that request both reserved and enforced by a quota. I'm probably
reinventing someone's wheel, but I'm almost there.

I have:

- created a local xfs filesystem, dedicated to per-job TMPDIR directories,
   with project quotas enabled on each slurmd host.

- created (slurmd) Prolog/Epilog scripts which create/delete a per-job
   directory on the xfs filesystem, owned by the job user.

- created SrunProlog/TaskProlog scripts, which set TMPDIR in the user's
   job environment to point at the per-job directory.

- added a gres defined as "Name=tmp Flags=CountOnly"

- modified the node definitions to include the amount of storage on each
   host, by adding "Gres=tmp:270G".
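
For illustration, the TaskProlog step above could look roughly like the
minimal Python sketch below. The base path /local/jobtmp and the function
names are hypothetical, and naming the per-job directory after the job ID
is an assumption. It relies on Slurm applying "export NAME=value" lines
printed by a TaskProlog to the task environment.

```python
import os

# Hypothetical mount point of the dedicated xfs filesystem (an assumption).
TMP_BASE = "/local/jobtmp"


def job_tmpdir(job_id, base=TMP_BASE):
    """Return the per-job directory path, assumed here to be <base>/<job id>."""
    return os.path.join(base, str(job_id))


def task_prolog_line(job_id, base=TMP_BASE):
    """Return the line a TaskProlog would print to set TMPDIR for the task."""
    return "export TMPDIR=" + job_tmpdir(job_id, base)


# Only prints when run inside a job, where SLURM_JOB_ID is set.
if __name__ == "__main__" and "SLURM_JOB_ID" in os.environ:
    print(task_prolog_line(os.environ["SLURM_JOB_ID"]))
```

The slurmd Prolog and Epilog would create and delete the same path, so
keeping the path logic in one helper avoids the two drifting apart.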

I still need to:

- extend the Prolog script to look up the "tmp" gres allocation for the
   job.

- extend the Prolog script to set the appropriate project quota on the
   per-job TMPDIR, limiting the amount of space the directory tree can use.
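
The quota step could be sketched as follows, assuming the usual xfs_quota
expert-mode commands ("project -s -p" to tag the directory tree, then
"limit -p bhard=" to cap it). Reusing the job ID as the xfs project ID,
the paths, and the function name are all assumptions, not something
already in place.

```python
def quota_commands(mountpoint, job_dir, project_id, limit_bytes):
    """Build the xfs_quota invocations for one per-job directory.

    Returns argument lists rather than running anything, so the Prolog can
    log them or pass each one to subprocess.run(cmd, check=True).
    """
    # Express the limit in whole MiB with an explicit "m" suffix so the
    # unit does not depend on xfs_quota's default for bare numbers.
    limit_mib = max(1, limit_bytes // (1024 * 1024))
    return [
        # Tag the directory tree with the project ID (assumed: the job ID).
        ["xfs_quota", "-x", "-c",
         f"project -s -p {job_dir} {project_id}", mountpoint],
        # Set a hard block limit for that project.
        ["xfs_quota", "-x", "-c",
         f"limit -p bhard={limit_mib}m {project_id}", mountpoint],
    ]
```

For example, quota_commands("/local/jobtmp", "/local/jobtmp/1234", 1234,
10 * 1024**3) yields a hard limit of "bhard=10240m" for project 1234.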


Unfortunately, I've not found anything in the Prolog environment (or 
stored on disk under /var/spool/slurmd) containing the gres allocations 
for the job.

I figure I can run "scontrol show job <jobid> -d" from inside the prolog
to get the job's gres information, but I'd need to hard-code the location
of the scontrol binary, and the Prolog documentation explicitly tells
you not to execute Slurm commands from within the prolog.
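
As a rough sketch of that scontrol approach (it does not address the
documentation's warning), the Python below fetches the detailed job text
and extracts the "tmp" count. It assumes the output contains a field such
as "GRES=tmp:10G" and that K/M/G/T/P suffixes are 1024-based multipliers;
the exact layout may differ between Slurm versions, and the function
names are hypothetical.

```python
import re
import subprocess

# Assumed 1024-based multipliers for gres count suffixes.
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3,
          "T": 1024**4, "P": 1024**5}


def fetch_job_text(job_id, scontrol="scontrol"):
    """Run 'scontrol show job <jobid> -d' and return its stdout."""
    result = subprocess.run([scontrol, "show", "job", str(job_id), "-d"],
                            capture_output=True, text=True, check=True)
    return result.stdout


def gres_bytes(job_text, name="tmp"):
    """Return the first '<name>:<count>' found in a GRES= field, in bytes.

    Returns None when the job text has no such gres entry.
    """
    pattern = r"GRES=\S*?\b" + re.escape(name) + r":(\d+)([KMGTP]?)"
    match = re.search(pattern, job_text)
    if match is None:
        return None
    return int(match.group(1)) * _UNITS[match.group(2)]
```

The Prolog would then call gres_bytes(fetch_job_text(job_id)) and treat a
None result as "no tmp gres requested", for example by falling back to a
site default.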

Is there a better way to get the job's gres information from within the 
prolog, please?

Thanks!

Mark


