[slurm-users] Per-job TMPDIR: how to lookup gres allocation in prolog?
Mark Dixon
mark.c.dixon at durham.ac.uk
Tue Nov 16 16:49:34 UTC 2021
Hi everyone,
I'd like to configure Slurm so that users can request an amount of disk
space for TMPDIR, and have that request reserved and enforced by quota,
via commands like "sbatch --gres tmp:10G jobscript.sh". I'm probably
reinventing someone's wheel, but I'm almost there.
I have:
- created a local xfs filesystem, dedicated to per-job TMPDIR directories,
with project quotas enabled on each slurmd host.
- created (slurmd) Prolog/Epilog scripts which create/delete a per-job
directory on the xfs filesystem, owned by the job user.
- created SrunProlog/TaskProlog scripts, which set TMPDIR in the user's
job environment to point at the per-job directory.
- added a gres defined as "Name=tmp Flags=CountOnly"
- modified the node definitions to include the amount of storage on each
host, by adding "Gres=tmp:270G".
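As an illustration of the TaskProlog step above, here is a minimal sketch
in Python. It relies on the documented TaskProlog convention that lines of
the form "export NAME=value" printed to stdout are added to the task's
environment. The mount point /local/jobtmp and the one-directory-per-job-id
layout are assumptions made for the example, not the actual paths in use.

```python
import os

# Hypothetical mount point of the dedicated xfs filesystem (assumption).
TMP_ROOT = "/local/jobtmp"


def job_tmpdir(job_id, root=TMP_ROOT):
    """Return the per-job directory path for a given job id."""
    return os.path.join(root, str(job_id))


def task_prolog_lines(env):
    """Build the lines a TaskProlog would print to point TMPDIR at the job directory."""
    job_id = env.get("SLURM_JOB_ID")
    if not job_id:
        # Not running under Slurm: emit nothing and leave TMPDIR alone.
        return []
    return ["export TMPDIR=" + job_tmpdir(job_id)]


if __name__ == "__main__":
    for line in task_prolog_lines(os.environ):
        print(line)
```

The logic is kept in a function that takes the environment as an argument,
so it can be checked without a running job.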
I still need to:
- extend the Prolog script to look up the "tmp" gres allocation for the
job.
- extend the Prolog script to set the appropriate project quota on the
per-job TMPDIR, limiting the amount of space the directory tree can use.
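The quota step might look roughly like the sketch below, which only builds
the xfs_quota command lines and does not run them. It uses the xfs_quota
expert-mode "project -s" and "limit -p bhard=" subcommands. Reusing the job
id as the xfs project id, the example paths, and the assumption that the
directory path contains no whitespace are all choices made for illustration.

```python
def quota_commands(job_dir, mount_point, project_id, limit_kib):
    """Return xfs_quota argument lists that tag job_dir with a project id
    and cap the blocks that project may use.

    project_id: numeric xfs project id (here assumed to be the Slurm job id).
    limit_kib:  hard block limit in KiB, passed with xfs_quota's "k" suffix.
    """
    # Mark the directory tree as belonging to the project.
    setup = "project -s -p {} {}".format(job_dir, project_id)
    # Set the hard block limit for that project.
    limit = "limit -p bhard={}k {}".format(limit_kib, project_id)
    return [
        ["xfs_quota", "-x", "-c", setup, mount_point],
        ["xfs_quota", "-x", "-c", limit, mount_point],
    ]


if __name__ == "__main__":
    # Example: a 10 GiB limit for hypothetical job 1234.
    for argv in quota_commands("/local/jobtmp/1234", "/local/jobtmp",
                               1234, 10 * 1024 * 1024):
        print(" ".join(argv))
```

In a real Prolog each list would be handed to subprocess.run(argv,
check=True), and the Epilog would need to clear the limit again before the
project id is reused.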
Unfortunately, I've not found anything in the Prolog environment (or
stored on disk under /var/spool/slurmd) containing the gres allocations
for the job.
I figure I can do a "scontrol show job <jobid> -d" from inside the prolog
to get the job's gres information, but I'll need to hard-code the location
of the scontrol binary... and the Prolog documentation explicitly tells
you not to execute slurm commands from within the prolog.
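For what it's worth, the parsing side of the scontrol approach could be
sketched as follows. It assumes the allocation appears somewhere in the
"scontrol show job" text as "tmp:<count>" or "tmp=<count>" with an optional
K/M/G/T/P suffix. The exact field name and layout vary between Slurm
versions, so the pattern is an assumption to check against real output.

```python
import re

# Slurm treats K/M/G/T/P suffixes on gres counts as powers of 1024.
_MULTIPLIERS = {
    "": 1,
    "K": 1024,
    "M": 1024 ** 2,
    "G": 1024 ** 3,
    "T": 1024 ** 4,
    "P": 1024 ** 5,
}

# Matches e.g. "tmp:10G" or "tmp=10G"; the surrounding field is not checked.
_TMP_GRES = re.compile(r"\btmp[:=](\d+)([KMGTP]?)\b")


def tmp_gres_bytes(scontrol_text):
    """Return the first "tmp" gres count found in the text, in bytes,
    or None if the job did not request any."""
    match = _TMP_GRES.search(scontrol_text)
    if match is None:
        return None
    count, suffix = match.groups()
    return int(count) * _MULTIPLIERS[suffix]


if __name__ == "__main__":
    # Made-up sample text, not captured from a real cluster.
    sample = "JobId=42 JobName=test\n   TresPerNode=gres:tmp:10G\n"
    print(tmp_gres_bytes(sample))
```

Returning None for a job without a "tmp" request leaves it to the Prolog
to decide whether to apply a default quota or none at all.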
Is there a better way to get the job's gres information from within the
prolog, please?
Thanks!
Mark