<html style="direction: ltr;">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<style type="text/css">body p { margin-bottom: 0cm; margin-top: 0pt; } </style>
</head>
<body bidimailui-charset-is-forced="true" style="direction: ltr;"
text="#000000" bgcolor="#FFFFFF">
<p>Make free tmpfs space a GRES (tracked as a TRES) and have NHC keep it updated, as in:</p>
<p>scontrol update nodename=... gres=tmpfree:$(stat -f -c "%f*%S" /tmp | bc)</p>
<p>Replace /tmp with your tmpfs mount.</p>
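<p>A minimal sketch of how that update could be hooked into nhc.conf
(assuming NHC's usual "&lt;target&gt; || &lt;check&gt;" syntax and that a plain
shell command is acceptable as a check; the GRES name "tmpfree" is just an
example):</p>
<pre>
# Hypothetical nhc.conf line: recompute free tmpfs bytes on every NHC run
# and push the value to the controller. A successful scontrol update exits 0,
# so it doubles as a passing check.
 * || scontrol update NodeName=$HOSTNAME Gres=tmpfree:$(stat -f -c '%f*%S' /tmp | bc)
</pre>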
<p><br>
</p>
<p>You'll have to define that GRES in slurm.conf and gres.conf as usual
(start with Count=1 and let NHC update it); an illustrative snippet
follows below.</p>
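<p>For illustration only (node names, ranges and the "tmpfree" name are
assumptions, not a tested configuration), the definition might look
roughly like this:</p>
<pre>
# slurm.conf (illustrative): declare the GRES type and give each node a
# placeholder count that NHC will overwrite at run time.
GresTypes=tmpfree
NodeName=node[001-099] Gres=tmpfree:1 ...

# gres.conf on each node (illustrative): a plain countable GRES, no plugin.
Name=tmpfree Count=1
</pre>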
<p><br>
</p>
<p>Do note that this is a simplistic example: updating like that will
overwrite any other gres defined for the node. You might therefore wish
to create an 'updategres' function that first reads in the node's
current gres, modifies only the counts of the fields you want to
change, and writes back a complete gres string; a rough sketch follows
below.</p>
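<p>As a rough, untested sketch (the helper name, the parsing of scontrol
output and the GRES name are assumptions; adapt to your Slurm version and
site):</p>
<pre>
#!/bin/bash
# Hypothetical helper: change one GRES count without clobbering the others.
updategres() {
    local node="$1" name="$2" count="$3"
    local current new
    # Current Gres string from the controller, e.g. "gpu:4,tmpfree:107374182400"
    current=$(scontrol show node "$node" | sed -n 's/^ *Gres=\([^ ]*\).*/\1/p')
    [ "$current" = "(null)" ] &amp;&amp; current=""
    # Drop any existing entry for $name, then append the new count
    new=$(echo "$current" | tr ',' '\n' | grep -v "^${name}:" | grep -v '^$' | paste -sd, -)
    new="${new:+${new},}${name}:${count}"
    scontrol update NodeName="$node" Gres="$new"
}

# Example: set tmpfree to the number of free bytes on /tmp
updategres "$(hostname -s)" tmpfree "$(stat -f -c '%f*%S' /tmp | bc)"
</pre>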
<p><br>
</p>
<p>In sbatch do:</p>
<p>sbatch --gres=tmpfree:20G</p>
<p>Based on the last update from NHC, the scheduler should then only
consider nodes with enough tmpfree for the job.<br>
</p>
<p><br>
</p>
<p>HTH</p>
<p>--Dani_L.<br>
</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 9/10/19 10:15 PM, Ole Holm Nielsen
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:9a83a697-cbae-2221-6f8f-107f6de8563b@fysik.dtu.dk">Hi
Michael,
<br>
<br>
Thanks for the suggestion! We have user requests for certain
types of jobs (quantum chemistry) that require fairly large local
scratch space. Our jobs normally do not have this requirement. So
unfortunately the per-node NHC check doesn't seem to do the
trick. (We already have an NHC check "check_fs_used /scratch
90%").
<br>
<br>
Best regards,
<br>
Ole
<br>
<br>
<br>
On 10-09-2019 20:41, Michael Jennings wrote:
<br>
<blockquote type="cite">On Monday, 02 September 2019, at 20:02:57
(+0200),
<br>
Ole Holm Nielsen wrote:
<br>
<br>
<blockquote type="cite">We have some users requesting that a
certain minimum size of the
<br>
*Available* (i.e., free) TmpFS disk space should be present on
nodes
<br>
before a job should be considered by the scheduler for a set
of
<br>
nodes.
<br>
<br>
I believe that the "sbatch --tmp=size" option merely refers to
the
<br>
TmpFS file system *Size* as configured in slurm.conf, and this
is
<br>
*not* what users need.
<br>
<br>
For example, a job might require 50 GB of *Available disk
space* on
<br>
the TmpFS file system, which may however have only 20 GB out
of 100
<br>
GB *Available* as shown by the df command, the rest having
been
<br>
consumed by other jobs (present or past).
<br>
<br>
However, when we do "scontrol show node <nodename>",
only the TmpFS
<br>
file system *Size* is displayed as a "TmpDisk" number, but not
the
<br>
*Available* number.
<br>
<br>
Question: How can we get slurmd to report back to the
scheduler the
<br>
amount of *Available* disk space? And how can users specify
the
<br>
minimum *Available* disk space required by their jobs
submitted by
<br>
"sbatch"?
<br>
<br>
If this is not feasible, are there other techniques that
achieve the
<br>
same goal? We're currently still at Slurm 18.08.
<br>
</blockquote>
<br>
Hi, Ole!
<br>
<br>
I'm assuming you are wanting a per-job resolution on this rather
than
<br>
per-node? If per-node is good enough, you can of course use NHC
to
<br>
check this, e.g.:
<br>
* || check_fs_free /tmp 50GB
<br>
<br>
That doesn't work per-job, though, obviously. Something that
might
<br>
work, however, as a temporary work-around for this might be to
have
<br>
the user run a single NHC command, like this:
<br>
srun --prolog='nhc -e "check_fs_free /tmp 50GB"'
<br>
<br>
There might be some tweaks/caveats to this since NHC normally
runs as
<br>
root, but just an idea.... :-) An even crazier idea would be
to set
<br>
NHC_LOAD_ONLY=1 in the environment, source /usr/sbin/nhc, and
then
<br>
execute the shell function `check_fs_free` directly! :-D
<br>
</blockquote>
<br>
</blockquote>
</body>
</html>