<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>I think this is exactly the type of use case that heterogeneous job
support (available since Slurm 17.11) is designed for:</p>
<p>
<blockquote type="cite">Slurm version 17.11 and later supports the
ability to submit and manage
heterogeneous jobs, in which each component has virtually all
job options
available including partition, account and QOS (Quality Of
Service).
For example, part of a job might require four cores and 4 GB for
each of 128
tasks while another part of the job would require 16 GB of
memory and one CPU.</blockquote>
<br>
<a class="moz-txt-link-freetext" href="https://slurm.schedmd.com/heterogeneous_jobs.html">https://slurm.schedmd.com/heterogeneous_jobs.html</a></p>
<p>Using this, you should be able to use a single core for the
transfer from NFS, all the cores/GPUs you need for the computation,
and then a single core to transfer the results back to NFS.<br>
</p>
<p>Disclaimer: I've never used this feature myself. <br>
</p>
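<p>That said, a heterogeneous batch script for this pattern might look
roughly like the sketch below. It is untested; the job name, resource
numbers, and paths are placeholders, and the directives use the newer
spelling (Slurm 20.02 and later), whereas 17.11 through 19.05 used
"packjob" and "--pack-group" instead of "hetjob" and "--het-group": <br>
</p>
<pre>#!/bin/bash
#SBATCH --job-name=stage-compute-unstage
# Component 0: one core to stage data in from NFS
#SBATCH --ntasks=1 --cpus-per-task=1 --mem=2G
#SBATCH hetjob
# Component 1: the actual computation, with the GPUs
#SBATCH --ntasks=1 --cpus-per-task=16 --gres=gpu:2 --mem=64G
#SBATCH hetjob
# Component 2: one core to copy results back to NFS and clean up
#SBATCH --ntasks=1 --cpus-per-task=1 --mem=2G

# Placeholder paths and commands follow. As far as I understand, all
# components of a heterogeneous job are allocated at the same time,
# and for node-local scratch the staging and compute components would
# also have to land on the same node, which may need extra care.
SCRATCH=/local/scratch/$SLURM_JOB_ID

srun --het-group=0 bash -c "mkdir -p $SCRATCH ; cp -r /nfs/project/dataset $SCRATCH/"
srun --het-group=1 ./process_data "$SCRATCH"/dataset
srun --het-group=2 bash -c "cp -r $SCRATCH/results /nfs/project ; rm -rf $SCRATCH"</pre>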
<pre class="moz-signature" cols="72">Prentice</pre>
<div class="moz-cite-prefix">On 4/3/21 5:31 PM, Fulcomer, Samuel
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAOORAuF_g8ZnipafqFE-73jpfwybd0y86bwm7a2P5xL_mSdYLw@mail.gmail.com">
<div dir="ltr">
<div>inline below...</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Sat, Apr 3, 2021 at 4:50
PM Will Dennis <<a href="mailto:wdennis@nec-labs.com"
moz-do-not-send="true">wdennis@nec-labs.com</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div style="overflow-wrap: break-word;" lang="EN-US">
<div class="gmail-m_7889209934540133168WordSection1">
<p class="MsoNormal">Sorry, obvs wasn’t ready to send
that last message yet…</p>
<p class="MsoNormal"> </p>
<p class="MsoNormal">Our issue is the shared storage is
via NFS, and the “fast storage in limited supply” is
only local on each node. Hence the need to copy it
over from NFS (and then remove it when finished with
it.)<br>
<br>
I also wanted the copy & remove to be different
jobs, because the main processing job usually requires
GPU gres, which is a time-limited resource on the
partition. I don’t want to tie up the allocation of
GPUs while the data is staged (and removed), and if
the data copy fails, I don’t want to even progress to
the job where the compute happens (so, like,
copy_data_locally && process_data).</p>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>...yup... this is the problem. We've invested in GPFS and
an NVMe Excelero pool (for initial placement); however, we
still have the problem of having users pull down data from
community repositories before running useful computation.</div>
<div><br>
</div>
<div>Your question has gotten me thinking about this more. In
our case all of our nodes are diskless (we do have fast
GPFS, though), so this wouldn't really work for us. But if
your fast storage is only local to your nodes, the
subsequent compute jobs will need to request those specific
nodes, so you'll need a mechanism to increase the SLURM
scheduling "weight" of the nodes after staging, so that the
scheduler won't pick them over nodes with a lower weight
for other work. That could be done in a job epilog.</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div style="overflow-wrap: break-word;" lang="EN-US">
<div class="gmail-m_7889209934540133168WordSection1">
<p class="MsoNormal"> </p>
<div>
<blockquote
style="border-top:none;border-right:none;border-bottom:none;border-left:1pt
solid rgb(204,204,204);padding:0in 0in 0in
6pt;margin:5pt 0in 5pt 4.8pt">
<div>
<div>
<div>
<div>
<p class="MsoNormal"
style="margin-right:0in;margin-bottom:5pt;margin-left:0in">
<span style="color:black">If you've got
other fast storage in limited supply
that can be used for data that can be
staged, then by all means use it, but
consider whether you want batch cpu
cores tied up with the wall time of
transferring the data. This could easily
be done on a time-shared frontend login
node from which the users could then
submit (via script) jobs after the data
was staged. Most of the transfer
wallclock is in network wait, so don't
waste dedicated cores for it.</span></p>
<p class="MsoNormal"
style="margin-right:0in;margin-bottom:5pt;margin-left:0in">
<span
style="font-size:13.5pt;font-family:-webkit-standard,serif;color:black"> </span></p>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
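<p>And on the node-weight idea in the reply quoted above: a job epilog
along these lines might do it. Again only an untested sketch; the
weight value is arbitrary, it assumes your Slurm version lets Weight
be changed with scontrol at run time and that the epilog has the
privileges to do so, and how staging jobs are identified (and when the
weight gets reset) is left open. <br>
</p>
<pre>#!/bin/bash
# Epilog fragment (sketch): after a data-staging job finishes on this
# node, raise the node's scheduling weight so the scheduler prefers
# other, lower-weight nodes for unrelated work, keeping this node free
# for the follow-up compute job that must run where the data was staged.
scontrol update NodeName="$SLURMD_NODENAME" Weight=1000</pre>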
</body>
</html>