<div dir="ltr">Hi Brian, <div><br></div><div>Thanks for suggesting Slurm's heterogeneous job support. <br></div><div>Sorry for the late follow-up; I only had access to the cluster for a short time. </div><div><br></div><div>We are now able to run the HPL benchmark across different partitions with the correct NUMA affinity. </div><div>For future reference, here is the procedure: </div><div><br></div><div>$ salloc \</div><div> --partition=v100 --nodes=1 --ntasks-per-node=40 --gres=gpu:4 : \</div><div> --partition=a100 --nodes=1 --ntasks-per-node=64 --gres=gpu:8<br></div><div><br></div><div>$ srun \</div><div> -n 4 : \</div><div> -n 8 \</div><div> hpl.sh</div><div><br></div><div>Initially, we expected some performance degradation when mixing partitions.</div><div>But at least in small-scale tests, it appears to be negligible. </div><div><br></div><div>Thanks. </div><div>Viet-Duc </div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Dec 8, 2022 at 2:27 AM Brian Andrus &lt;<a href="mailto:toomuchit@gmail.com" target="_blank">toomuchit@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>You may want to look here:</p>
<p><a href="https://slurm.schedmd.com/heterogeneous_jobs.html" target="_blank">https://slurm.schedmd.com/heterogeneous_jobs.html</a></p>
<p>Brian Andrus<br>
</p>
<div>On 12/7/2022 12:42 AM, Le, Viet Duc
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">Dear
Slurm community, </p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br style="margin:0px;padding:0px">
</p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">I
am encountering a unique situation where I need to allocate
jobs to nodes with different numbers of CPU cores. For
instance: </p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">node01:
Xeon 6226 32 cores</p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">node02:
EPYC 7543 64 cores</p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br>
</p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit">$ </span>salloc
--partition=all --nodes=2 --nodelist=gpu01,gpu02 --ntasks-per-node=32 --comment=etc<br style="margin:0px;padding:0px">
</p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit">If
--ntasks-per-node is larger than</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit"> 32</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit">,
the job cannot be allocated, since node01 has only</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit"> 32</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit"> cores. </span><br style="margin:0px;padding:0px">
</p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br style="margin:0px;padding:0px">
</p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">In
the context of NVIDIA's HPL container, we need to pin MPI
processes according to NUMA affinity for best performance. </p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">On
the HGX-1, the 8 A100s have affinity with the 1st, 3rd, 5th,
and 7th NUMA domains, respectively. </p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">With
--ntasks-per-node=32, only the first half of the EPYC's NUMA
domains is available, so we had to assign the 4th-7th A100s to
the 0th and 2nd NUMA domains, leading to some performance
degradation. </p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br style="margin:0px;padding:0px">
</p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">I
am looking for a way to request more tasks than the number of
physically available cores, i.e. </p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span style="margin:0px;padding:0px;font-family:inherit;font-size:10pt;font-weight:inherit">$ </span><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit">salloc
--partition=all</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"> --nodes=2</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"> --nodelist=gpu01,gpu02</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"> --ntasks-per-node=64</span><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"> </span><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit">--comment=etc</span></p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit"><br style="margin:0px;padding:0px">
</span></p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><span style="margin:0px;padding:0px;font-family:inherit;font-size:13.3333px;font-weight:inherit">Your
suggestions are much appreciated. </span></p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)"><br style="margin:0px;padding:0px">
</p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">Regards, </p>
<p style="margin:0px;padding:0px;font-size:13.3333px;font-family:Tahoma;color:rgb(51,51,51)">Viet-Duc</p>
</div>
</blockquote>
</div>
</blockquote></div>