<html style="direction: ltr;">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <style id="bidiui-paragraph-margins" type="text/css">body p { margin-bottom: 0cm; margin-top: 0pt; } </style>
  </head>
  <body bidimailui-charset-is-forced="true" style="direction: ltr;">
    <p>Hello Anne,</p>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 01/09/2022 02:01:53, Anne Hammond
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAE-KjbMQVX1qaYKRq9w2qqeiZDKepyQntx=kUGecX=0mBZ=Z9w@mail.gmail.com">
      <div dir="ltr">We have a 
        <div>  CentOS 8.5 cluster </div>
        <div>  slurm 20.11</div>
        <div>  Mellanox ConnectX 6 HDR IB and Mellanox 32 port switch</div>
        <div><br>
        </div>
        <div>Our application is not scaling.  I found that inter-process
          communication is going over Ethernet, not InfiniBand: I took the
          ifconfig byte counts for the eno2 (Ethernet) and ib0 (InfiniBand)
          interfaces at the end of a job and subtracted the counts from
          the beginning.  We are using sbatch and</div>
        <div>srun {application}</div>
        <div><br>
        </div>
        <div>If I log in to a node interactively and run the command</div>
        <div>mpiexec -iface ib0 -n 32 -machinefile machinefile
          {application}</div>
      </div>
    </blockquote>
    Is your application using IPoIB or RDMA?<br>
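    <p>The distinction matters for how you measure: the interface counters
      that ifconfig reports only cover traffic that passes through the
      kernel network stack, so IPoIB traffic shows up on ib0, while native
      RDMA traffic bypasses that stack and is only visible in the HCA port
      counters. Below is a minimal sketch for sampling both, assuming the
      usual Linux sysfs layout on the compute nodes. The device name
      mlx5_0, port 1 and all function names are assumptions for
      illustration; check /sys/class/infiniband for the real device name.</p>
    <pre>
```python
import os


def read_counter(path):
    """Return the integer stored in a sysfs counter file."""
    with open(path) as f:
        return int(f.read().strip())


def net_bytes(iface, root="/sys/class/net"):
    """Bytes (rx + tx) seen by the kernel network stack for one interface."""
    stats = os.path.join(root, iface, "statistics")
    return sum(read_counter(os.path.join(stats, name))
               for name in ("rx_bytes", "tx_bytes"))


def rdma_bytes(device, port=1, root="/sys/class/infiniband"):
    """Bytes (rx + tx) moved by one HCA port, whatever the upper protocol.

    The sysfs port data counters are kept in units of 4 octets,
    hence the multiplication by 4.
    """
    counters = os.path.join(root, device, "ports", str(port), "counters")
    words = sum(read_counter(os.path.join(counters, name))
                for name in ("port_rcv_data", "port_xmit_data"))
    return 4 * words


def snapshot(eth="eno2", ipoib="ib0", device="mlx5_0"):
    """Sample all three counters; the default names are assumptions."""
    return {
        "ethernet": net_bytes(eth),
        "ipoib": net_bytes(ipoib),
        "ib_port": rdma_bytes(device),
    }


def delta(before, after):
    """Per-counter difference between two snapshots."""
    return {name: after[name] - before[name] for name in before}
```
</pre>
    <p>Calling snapshot() on a compute node before and after the job step
      and passing both results to delta() gives the three byte counts. If
      the ipoib delta stays near zero while ib_port grows, the traffic is
      most likely RDMA; if both grow together, it is most likely IPoIB.</p>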
    <blockquote type="cite"
cite="mid:CAE-KjbMQVX1qaYKRq9w2qqeiZDKepyQntx=kUGecX=0mBZ=Z9w@mail.gmail.com">
      <div dir="ltr">
        <div><br>
        </div>
        <div>where machinefile contains 32 lines with the ib hostname:</div>
        <div>ne08-ib</div>
        <div>ne08-ib</div>
        <div>...</div>
        <div>ne09-ib</div>
        <div>ne09-ib</div>
        <div><br>
        </div>
        <div>the application runs over ib and scales.  </div>
        <div><br>
        </div>
        <div>/etc/slurm/slurm.conf uses the ethernet interface for
          administrative communications and allocation:</div>
        <div><br>
        </div>
        <div>
          <p class="gmail-p1"
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
              class="gmail-s1"
              style="font-variant-ligatures:no-common-ligatures">NodeName=ne[01-09]
              CPUs=32 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1
              State=UNKNOWN</span></p>
          <p class="gmail-p1"
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
              class="gmail-s1"
              style="font-variant-ligatures:no-common-ligatures"><br>
            </span></p>
          <p class="gmail-p1"
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
              class="gmail-s1"
              style="font-variant-ligatures:no-common-ligatures">PartitionName=neon-noSMT
              Nodes=ne[01-09] Default=NO MaxTime=3-00:00:00
              DefaultTime=4:00:00 State=UP OverSubscribe=YES</span></p>
          <p class="gmail-p1"
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
              class="gmail-s1"
              style="font-variant-ligatures:no-common-ligatures"><br>
            </span></p>
          <p class="gmail-p1"
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
              class="gmail-s1"
              style="font-variant-ligatures:no-common-ligatures">I've
              read this is the recommended configuration.</span></p>
        </div>
        <div><br>
        </div>
        <div>I looked for srun parameters that would instruct srun to
          run over the ib interface when the job is run through the
          slurm queue.  </div>
        <div><br>
        </div>
        <div>I found the --network parameter:</div>
        <div>
          <p class="gmail-p1"
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
              class="gmail-s1"
              style="font-variant-ligatures:no-common-ligatures">srun
              --network=DEVNAME=mlx5_ib,DEVTYPE=IB</span></p>
        </div>
      </div>
    </blockquote>
    <p>What is the output of the following command?</p>
    <p>srun --mpi=list<br>
    </p>
    <blockquote type="cite"
cite="mid:CAE-KjbMQVX1qaYKRq9w2qqeiZDKepyQntx=kUGecX=0mBZ=Z9w@mail.gmail.com">
      <div dir="ltr">
        <div>
          <p class="gmail-p1"
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
              class="gmail-s1"
              style="font-variant-ligatures:no-common-ligatures"><br>
            </span></p>
          <p class="gmail-p1"
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
              class="gmail-s1"
              style="font-variant-ligatures:no-common-ligatures">but
              there is not much documentation on this and I haven't been
              able to run a job yet.</span></p>
          <p class="gmail-p1"
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
              class="gmail-s1"
              style="font-variant-ligatures:no-common-ligatures"><br>
            </span></p>
          <p class="gmail-p1"
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
              class="gmail-s1"
              style="font-variant-ligatures:no-common-ligatures">Is this
              the way we should be directing srun to run the executable
              over infiniband?</span></p>
          <p class="gmail-p1"
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
              class="gmail-s1"
              style="font-variant-ligatures:no-common-ligatures"><br>
            </span></p>
          <p class="gmail-p1"
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
              class="gmail-s1"
              style="font-variant-ligatures:no-common-ligatures">Thanks
              in advance,</span></p>
          <p class="gmail-p1"
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span
              class="gmail-s1"
              style="font-variant-ligatures:no-common-ligatures">Anne
              Hammond</span></p>
        </div>
        <div><br>
        </div>
        <div><br>
        </div>
      </div>
    </blockquote>
    <pre class="moz-signature" cols="72">-- 
Regards,
--Dani_L.</pre>
  </body>
</html>