<div dir="ltr">Hi Brian, <div><br></div><div>Thanks for suggesting this interesting feature of Slurm. <br></div><div>Sorry for the late follow-up; I only had access to the cluster for a short time. </div><div><br></div><div>We are now able to run the HPL benchmark across different partitions with the correct NUMA affinity. </div><div>For future reference, here is the procedure: </div><div><br></div><div>$ salloc \</div><div>       --partition=v100 --nodes=1 --ntasks-per-node=40 --gres=gpu:4 : \</div><div>       --partition=a100 --nodes=1 --ntasks-per-node=64 --gres=gpu:8 <br></div><div><br></div><div>$ srun \</div><div>       -n 4 : \</div><div>       -n 8 \</div><div>       hpl.sh </div><div><br></div><div>Initially, we expected some performance degradation when mixing partitions.</div><div>But at least for a small-scale test, it seems to be negligible. </div><div><br></div><div>Thanks. </div><div>Viet-Duc  </div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Dec 8, 2022 at 2:27 AM Brian Andrus &lt;<a href="mailto:toomuchit@gmail.com" target="_blank">toomuchit@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div>
    <p>You may want to look here:</p>
    <p><a href="https://slurm.schedmd.com/heterogeneous_jobs.html" target="_blank">https://slurm.schedmd.com/heterogeneous_jobs.html</a></p>
    <p>Brian Andrus<br>
    </p>
    <div>On 12/7/2022 12:42 AM, Le, Viet Duc
      wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">Dear
          Slurm community, </p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br style="margin:0px;padding:0px">
        </p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">I
          am encountering a unique situation where I need to allocate
          jobs to nodes with different numbers of CPU cores. For
          instance: </p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">node01: 
          Xeon 6226 32 cores</p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">node02: 
          EPYC 7543 64 cores</p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br>
        </p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit">$ </span>salloc
--partition=all --nodes=2 --nodelist=gpu01,gpu02 --ntasks-per-node=32 --comment=etc<br style="margin:0px;padding:0px">
        </p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit">If
            --ntasks-per-node is larger than</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit"> 32</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit">,
            the job cannot be allocated, since node01 has only</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit"> 32</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit"> cores. </span><br style="margin:0px;padding:0px">
        </p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br style="margin:0px;padding:0px">
        </p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">In
          the context of NVIDIA's HPL container, we need to pin MPI
          processes according to NUMA affinity for best performance. </p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">On
          the HGX-1, the 8 A100s have affinity with the 1st, 3rd, 5th,
          and 7th NUMA domains. </p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">With
          --ntasks-per-node=32, only the first half of the EPYC's NUMA
          domains is available, so we had to assign the 4th-7th A100s to
          the 0th and 2nd NUMA domains, leading to some performance
          degradation. </p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br style="margin:0px;padding:0px">
        </p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">I
          am looking for a way to request more tasks than the number of
          physically available cores, i.e.  </p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit">$ </span><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit">salloc
            --partition=all</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"> --nodes=2</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"> --nodelist=gpu01,gpu02</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"> --ntasks-per-node=64</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"> </span><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit">--comment=etc</span></p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"><br style="margin:0px;padding:0px">
          </span></p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit">Your
            suggestions are much appreciated. </span></p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br style="margin:0px;padding:0px">
        </p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">Regards, </p>
        <p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">Viet-Duc</p>
      </div>
    </blockquote>
  </div>

</blockquote></div>
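<div><br></div>
<div>For anyone finding this thread later, here is a minimal sketch of how a per-rank wrapper could pin each MPI process to a NUMA domain. This is not the actual hpl.sh used above. The function name numa_for_local_rank is hypothetical, and the mapping (two GPUs per domain, assigned in ascending order to domains 1, 3, 5, 7) is an assumption; check the real topology with "nvidia-smi topo -m" before relying on it. SLURM_LOCALID is the node-local task ID that Slurm sets for each task.</div>

```shell
#!/usr/bin/env bash
# Hypothetical NUMA-pinning wrapper (a sketch, not the hpl.sh from this thread).

# Map a node-local rank to a NUMA domain.
# Assumption: 8 GPUs, two per domain, attached to domains 1, 3, 5, 7
# in ascending order (ranks 0-1 -> 1, 2-3 -> 3, 4-5 -> 5, 6-7 -> 7).
numa_for_local_rank() {
    local rank=$1
    echo $(( 2 * (rank / 2) + 1 ))
}

# Launch the given command only when running inside a Slurm task
# and a command was actually passed to the wrapper.
if [ -n "${SLURM_LOCALID:-}" ] && [ "$#" -gt 0 ]; then
    numa=$(numa_for_local_rank "$SLURM_LOCALID")
    # Bind both CPU scheduling and memory allocation to that domain.
    exec numactl --cpunodebind="$numa" --membind="$numa" "$@"
fi
```

<div>A wrapper like this would be placed in front of the benchmark binary in the srun line; a node with a different GPU layout (such as the V100 node) would need its own mapping.</div>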