<html style="direction: ltr;">

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

    <style type="text/css">body p { margin-bottom: 0cm; margin-top: 0pt; } </style>

  </head>

  <body bidimailui-charset-is-forced="true" style="direction: ltr;"

    text="#000000" bgcolor="#FFFFFF">

    <p>Make tmpfs a TRES, and have NHC update that as in:</p>

    <p>scontrol update nodename=... gres=tmpfree:$(stat -f /tmp -c

      "%f*%S" | bc)</p>

    <p>Replace /tmp with your tmpfs mount.</p>
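    <p>A minimal sketch of the same computation without the bc
      dependency, assuming GNU coreutils stat and a POSIX shell. The
      function name tmpfree_bytes is made up for illustration and is not
      part of NHC or Slurm.</p>

    <pre>
```shell
# Hypothetical helper: print the free space of a mounted filesystem in bytes.
# Assumes GNU coreutils stat, where %f is the number of free blocks and
# %S is the fundamental block size used for block counts.
# (%a instead of %f would count only blocks available to non-root users.)
tmpfree_bytes() {
    mount_point="${1:-/tmp}"
    # stat prints an expression such as "123456*4096".
    expr_text=$(stat -f -c '%f*%S' "$mount_point") || return 1
    # Let the shell evaluate the product, so bc is not needed.
    echo $(( $expr_text ))
}
```
    </pre>

    <p>With that in place the update would read: scontrol update
      nodename=... gres=tmpfree:$(tmpfree_bytes /tmp)</p>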

    <p><br>

    </p>

    <p>You'll have to define that TRES in slurm.conf and gres.conf as

      usual (start with count=1 and have NHC update it).</p>

    <p><br>

    </p>

    <p>Do note that this is a simplistic example - updating like that

      will overwrite any other gres defined for the node, so you might

      wish to create an 'updategres' function that first reads in the

      node's current gres, only modifies the count of the fields you

      wish to modify, and returns a complete gres string.</p>
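    <p>A rough sketch of what such an 'updategres' function could look
      like in plain shell. It only does the string rewriting. Reading the
      node's current gres out of scontrol is left to the caller, and the
      sketch assumes simple untyped name:count entries separated by
      commas, with "(null)" meaning no gres is set. The argument order is
      my own choice.</p>

    <pre>
```shell
# Hypothetical helper: set the count of one gres inside a gres string.
# usage: updategres CURRENT_GRES NAME NEW_COUNT
# Assumes untyped, comma-separated entries such as "gpu:2,tmpfree:1".
updategres() {
    current="$1"; name="$2"; count="$3"
    result=""; found=0
    # Treat "(null)" like an empty gres field.
    if [ "$current" = "(null)" ]; then current=""; fi
    old_ifs="$IFS"; IFS=','
    for entry in $current; do
        case "$entry" in
            "$name"|"$name":*)
                # Replace only the entry we were asked to modify.
                entry="$name:$count"; found=1 ;;
        esac
        result="${result:+$result,}$entry"
    done
    IFS="$old_ifs"
    # Append the gres if the node did not list it yet.
    if [ "$found" -eq 0 ]; then
        result="${result:+$result,}$name:$count"
    fi
    echo "$result"
}
```
    </pre>

    <p>For example, updategres "gpu:2,tmpfree:1" tmpfree 500 prints
      gpu:2,tmpfree:500, which can then be passed to scontrol update as
      the complete gres= value.</p>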

    <p> </p>

    <p><br>

    </p>

    <p>In sbatch do:</p>

    <p>sbatch --gres=tmpfree:20G</p>

    <p>The scheduler should then only consider nodes that, as of the

      last update from NHC, have enough tmpfree for the job.<br>

    </p>

    <p><br>

    </p>

    <p>HTH</p>

    <p>--Dani_L.<br>

    </p>

    <p><br>

    </p>

    <div class="moz-cite-prefix">On 9/10/19 10:15 PM, Ole Holm Nielsen

      wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:9a83a697-cbae-2221-6f8f-107f6de8563b@fysik.dtu.dk">Hi

      Michael,

      <br>

      <br>

      Thanks for the suggestion!  We have user requests for certain

      types of jobs (quantum chemistry) that require fairly large local

      scratch space. Our jobs normally do not have this requirement.  So

      unfortunately the per-node NHC check doesn't seem to do the

      trick.  (We already have an NHC check "check_fs_used /scratch

      90%").

      <br>

      <br>

      Best regards,

      <br>

      Ole

      <br>

      <br>

      <br>

      On 10-09-2019 20:41, Michael Jennings wrote:

      <br>

      <blockquote type="cite">On Monday, 02 September 2019, at 20:02:57

        (+0200),

        <br>

        Ole Holm Nielsen wrote:

        <br>

        <br>

        <blockquote type="cite">We have some users requesting that a

          certain minimum size of the

          <br>

          *Available* (i.e., free) TmpFS disk space should be present on

          nodes

          <br>

          before a job should be considered by the scheduler for a set

          of

          <br>

          nodes.

          <br>

          <br>

          I believe that the "sbatch --tmp=size" option merely refers to

          the

          <br>

          TmpFS file system *Size* as configured in slurm.conf, and this

          is

          <br>

          *not* what users need.

          <br>

          <br>

          For example, a job might require 50 GB of *Available disk

          space* on

          <br>

          the TmpFS file system, which may however have only 20 GB out

          of 100

          <br>

          GB *Available* as shown by the df command, the rest having

          been

          <br>

          consumed by other jobs (present or past).

          <br>

          <br>

          However, when we do "scontrol show node &lt;nodename&gt;",

          only the TmpFS

          <br>

          file system *Size* is displayed as a "TmpDisk" number, but not

          the

          <br>

          *Available* number.

          <br>

          <br>

          Question: How can we get slurmd to report back to the

          scheduler the

          <br>

          amount of *Available* disk space?  And how can users specify

          the

          <br>

          minimum *Available* disk space required by their jobs

          submitted by

          <br>

          "sbatch"?

          <br>

          <br>

          If this is not feasible, are there other techniques that

          achieve the

          <br>

          same goal?  We're currently still at Slurm 18.08.

          <br>

        </blockquote>

        <br>

        Hi, Ole!

        <br>

        <br>

        I'm assuming you are wanting a per-job resolution on this rather

        than

        <br>

        per-node?  If per-node is good enough, you can of course use NHC

        to

        <br>

        check this, e.g.:

        <br>

           * || check_fs_free /tmp 50GB

        <br>

        <br>

        That doesn't work per-job, though, obviously.  Something that

        might

        <br>

        work, however, as a temporary work-around for this might be to

        have

        <br>

        the user run a single NHC command, like this:

        <br>

           srun --prolog='nhc -e "check_fs_free /tmp 50GB"'

        <br>

        <br>

        There might be some tweaks/caveats to this since NHC normally

        runs as

        <br>

        root, but just an idea....  :-)  An even crazier idea would be

        to set

        <br>

        NHC_LOAD_ONLY=1 in the environment, source /usr/sbin/nhc, and

        then

        <br>

        execute the shell function `check_fs_free` directly!  :-D

        <br>

      </blockquote>

      <br>

    </blockquote>

  </body>

</html>