[slurm-users] How can jobs request a minimum available (free) TmpFS disk space?

Bjørn-Helge Mevik b.h.mevik at usit.uio.no
Tue Sep 3 10:24:03 UTC 2019


Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk> writes:

> I figured that other sites need the free disk space feature as well
> :-)

:)

> How do you dynamically update your gres=localtmp resource according to
> the current disk free space?  I mean, there is already a TmpFS disk
> space size defined in slurm.conf, so how does your gres=localtmp
> differ from TmpFS?

We simply define the total "count" on the NodeName lines for the compute nodes in slurm.conf, like

    Nodename=c11-[1-36] Gres=localtmp:170 ...

for nodes with 170 GB of disk.

Then Slurm will do the rest; it will keep track of these 170 localtmp
"units" and not hand out more than that to jobs.  The jobs just specify
--gres=localtmp:50 for 50 "units".  (Slurm doesn't know how much disk
there is, or even that "localtmp" refers to disk space; it only keeps
count of the units in the Gres definition, so we could just as well have
chosen MB as units, or multiples of Pi, if we really wanted :) )
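
For reference, a minimal sketch of the configuration this implies.  The
NodeName line is the one shown above; the GresTypes line is needed for
Slurm to accept the new Gres name.  The gres.conf entry is an assumption:
depending on the Slurm version, a matching entry on the nodes may or may
not be required.

```
# slurm.conf
GresTypes=localtmp
NodeName=c11-[1-36] Gres=localtmp:170 ...

# gres.conf (assumption: may be needed, depending on the Slurm version)
NodeName=c11-[1-36] Name=localtmp Count=170
```

With this in place, three jobs asking for --gres=localtmp:50 can share a
node (150 of 170 units), while a fourth has to wait or go elsewhere.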

So we don't use the TmpFS setting at all.  In our prolog, when a job has
asked for "localtmp", we create a directory for the job
(/localscratch/$SLURM_JOB_ID), and set an environment variable $LOCALTMP
to that directory, so the user can do "cp mydata $LOCALTMP" etc. in the
job script.  Then in the epilog, we delete the area.
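
A rough sketch of how these steps could look as shell helpers; it is not
the actual scripts used here.  SCRATCH_BASE and the function names are
hypothetical.  Exporting the variable through a TaskProlog (which adds
"export NAME=value" lines from its stdout to the task environment) is one
possible mechanism, assumed here.  Detecting whether the job asked for
"localtmp" is left out, since the details are not given above.

```shell
#!/bin/bash
# Sketch only: helpers for a Prolog / TaskProlog / Epilog trio.
# SCRATCH_BASE is a hypothetical override; the default matches the post.
SCRATCH_BASE="${SCRATCH_BASE:-/localscratch}"

# Prolog: create the per-job directory and print its path.
create_job_scratch() {
    local job_id="$1" owner="${2:-}"
    local dir="$SCRATCH_BASE/$job_id"
    mkdir -p "$dir" || return 1
    chmod 700 "$dir"
    # Hand the directory to the job owner when one is given (needs root).
    if [ -n "$owner" ]; then
        chown "$owner" "$dir" || return 1
    fi
    echo "$dir"
}

# TaskProlog: print the line that puts LOCALTMP into the task environment.
print_job_env() {
    local job_id="$1"
    echo "export LOCALTMP=$SCRATCH_BASE/$job_id"
}

# Epilog: delete the area, refusing to act on an empty job ID.
remove_job_scratch() {
    local job_id="$1"
    [ -n "$job_id" ] || return 1
    rm -rf -- "${SCRATCH_BASE:?}/$job_id"
}
```

A Prolog would then call something like
create_job_scratch "$SLURM_JOB_ID" "$SLURM_JOB_USER", and the Epilog
remove_job_scratch "$SLURM_JOB_ID".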

The new thing we are looking into, then, is to set a "project" quota
(a.k.a. folder quota) for the $LOCALTMP directory, and clear the quota
afterwards.  XFS supports this, and so does ext4 with a recent enough
version of the e2fsprogs toolkit.
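
For the XFS case, a sketch of the commands involved.  The helpers below
only print the xfs_quota invocations instead of running them.  They
assume that the file system is mounted with the prjquota option and that
the job ID is reused as the project ID; both are assumptions, not
something decided above.  The function names are hypothetical.

```shell
#!/bin/bash
# Sketch only: print the xfs_quota commands for a per-job project quota.
# Assumptions: XFS mounted with prjquota; project ID = job ID.

# Commands a prolog could run after creating the job directory.
quota_set_cmds() {
    local mount="$1" job_id="$2" limit_gb="$3"
    local dir="$mount/$job_id"
    # Tag the directory tree as belonging to the project.
    echo "xfs_quota -x -c 'project -s -p $dir $job_id' $mount"
    # Set a hard block limit for that project.
    echo "xfs_quota -x -c 'limit -p bhard=${limit_gb}g $job_id' $mount"
}

# Commands an epilog could run before deleting the job directory.
quota_clear_cmds() {
    local mount="$1" job_id="$2"
    # A limit of 0 removes the limit.
    echo "xfs_quota -x -c 'limit -p bhard=0 $job_id' $mount"
    # Clear the project tagging from the directory tree.
    echo "xfs_quota -x -c 'project -C -p $mount/$job_id $job_id' $mount"
}
```

For example, "quota_set_cmds /localscratch 12345 50" prints the two
commands for a job that asked for --gres=localtmp:50 with GB as the unit.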

> With "scontrol show node xxx" we get the node memory values such as
> "RealMemory=256000 AllocMem=240000 FreeMem=160056".  Similarly it
> would be great to augment the TmpDisk with a FreeDisk parameter, for
> example "TmpDisk=140000 FreeDisk=90000".

That would have been nice, yes.

> Would a Slurm modification be required to include a FreeDisk
> parameter, and then change the meaning of "sbatch --tmp=xxx" to refer
> to the FreeDisk in stead of TmpDisk size?

I think it would, yes.

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo