<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>You may want to look at Slurm's heterogeneous job support:</p>
    <p><a class="moz-txt-link-freetext" href="https://slurm.schedmd.com/heterogeneous_jobs.html">https://slurm.schedmd.com/heterogeneous_jobs.html</a></p>
    <p>Brian Andrus<br>
    </p>
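    <p>As a rough sketch of how that page might apply here: each node
      can be requested as its own job component, with the components
      separated by a colon. The per-node task counts below are
      assumptions taken from the core counts quoted in the original
      message, <code>./xhpl</code> is a hypothetical placeholder for the
      actual launch command, and <code>--het-group</code> assumes a
      Slurm release recent enough to provide it. This is not a tested
      recipe.</p>
    <pre>
# Component 0: the 32-core node. Component 1: the 64-core node.
$ salloc --partition=all --nodes=1 --nodelist=gpu01 --ntasks-per-node=32 : \
         --partition=all --nodes=1 --nodelist=gpu02 --ntasks-per-node=64

# Inside the allocation, launch one step spanning both components.
# ./xhpl is a placeholder, not a command from the original message.
$ srun --het-group=0,1 ./xhpl
    </pre>
    <p>Options are repeated in each component here to avoid relying on
      any being carried over from the first one.</p>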
    <div class="moz-cite-prefix">On 12/7/2022 12:42 AM, Le, Viet Duc
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAMAsV6gBKReS2MxmHf28j9JU613739WmEUYBao6KE2VXT5981g@mail.gmail.com">
      <div dir="ltr">
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">Dear
          slurm community, </p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br
            style="margin:0px;padding:0px">
        </p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">I
          am encountering a unique situation where I need to allocate
          jobs to nodes with different numbers of CPU cores. For
          instance: </p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">node01: 
          Xeon 6226 32 cores</p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">node02: 
          EPYC 7543 64 cores</p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br>
        </p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span
style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit">$ </span>salloc
--partition=all --nodes=2 --nodelist=gpu01,gpu02 --ntasks-per-node=32 --comment=etc<br
            style="margin:0px;padding:0px">
        </p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span
style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit">If
            --ntasks-per-node is larger than 32, the job cannot be
            allocated, since node01 has only 32 cores. </span><br
            style="margin:0px;padding:0px">
        </p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br
            style="margin:0px;padding:0px">
        </p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">In
          the context of NVIDIA's HPL container, we need to pin MPI
          processes according to NUMA affinity for best performance. </p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">For
          HGX-1, the 8 A100s have affinity with the 1st, 3rd, 5th,
          and 7th NUMA domains. </p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">With
          --ntasks-per-node=32, only the first half of the EPYC's NUMA
          domains is available, so we had to assign the 4th through 7th
          A100s to the 0th and 2nd NUMA domains, which causes some
          performance degradation. </p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br
            style="margin:0px;padding:0px">
        </p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">I
          am looking for a way to request more tasks than the number of
          physically available cores on the smaller node, for example: </p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span
style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit">$ </span><span
style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit">salloc
            --partition=all</span><span
style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"> --nodes=2</span><span
style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"> --nodelist=gpu01,gpu02</span><span
style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"> --ntasks-per-node=64</span><span
style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"> </span><span
style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit">--comment=etc</span></p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span
style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"><br
              style="margin:0px;padding:0px">
          </span></p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span
style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit">Your
            suggestions are much appreciated. </span></p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br
            style="margin:0px;padding:0px">
        </p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">Regards, </p>
        <p
style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">Viet-Duc</p>
      </div>
    </blockquote>
  </body>
</html>