<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>On 3/9/21 3:16 AM, Ward Poelmans wrote:<br>
</p>
<blockquote type="cite"
cite="mid:01c9b606-7942-c161-8051-78308f196ff6@vub.be">
<pre class="moz-quote-pre" wrap="">Hi Prentice,
On 8/03/2021 22:02, Prentice Bisbal wrote:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">I have a very hetergeneous cluster with several different generations of
AMD and Intel processors, we use this method quite effectively.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
Could you elaborate a bit more on how you manage that? Do you force your
users to pick a feature? What if a user submits a multi-node job, can
you make sure it will not start over a mix of avx512 and avx2 nodes?</pre>
</blockquote>
<p>I don't force the users to pick a feature, and to make matters
worse, I think our login nodes are newer than some of the compute
nodes, so it's entirely possible that if someone really optimizes
their code for one of the login nodes, their job could get
assigned to a node that doesn't understand the instruction set,
resulting in the dreaded "Illegal Instruction" error. Surprisingly,
this has only happened a few times in the 5 years I've been at
this job. <br>
</p>
<p>I assume most users would want to use the newest and fastest
processors if given the choice, so I set the priority weighting of
the nodes so that the newest nodes are highest priority, and the
oldest nodes the lowest priority. <br>
</p>
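<p>In slurm.conf terms, that means giving the newest nodes the
lowest Weight value, since Slurm allocates the lowest-weight
nodes first when everything else is equal. A minimal sketch
(the node names, weights, and feature names here are
hypothetical, not our actual config):</p>
<pre># slurm.conf excerpt: lower Weight = allocated first,
# so the new EPYC nodes are preferred over the old Opterons
NodeName=epyc[001-010]    Weight=10  Feature=7281,avx2
NodeName=opteron[001-010] Weight=100 Feature=6376,avx</pre>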
<p>The only way to make sure a job sticks to a certain
instruction set is for users to specify the processor model
rather than the instruction set family. For example,</p>
<p>-C 7281 will get you only AMD EPYC 7281 processors<br>
</p>
<p>and <br>
</p>
<p>-C 6376 will get you only AMD Opteron 6376 processors</p>
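<p>On the command line that looks like this (assuming those
model numbers are defined as node features in slurm.conf, as in
the sketch above; the job script name is just a placeholder):</p>
<pre>sbatch -C 7281 myjob.sh   # lands only on EPYC 7281 nodes
sbatch -C 6376 myjob.sh   # lands only on Opteron 6376 nodes</pre>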
<p>Using your example, if you don't want to mix AVX2 and AVX512
processors in the same job ever, you can "lie" to Slurm in your
topology file and come up with a topology where the two subsets of
nodes can't talk to each other. That way, Slurm will not mix nodes
of the different instruction sets. The problem with this is that
it's a "permanent" solution - it's not flexible. I would imagine
there are times when you would want to use both your AVX2 and
AVX512 processors in a single job. <br>
</p>
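<p>A sketch of what that "lie" might look like in topology.conf,
with the topology/tree plugin enabled (node and switch names
hypothetical):</p>
<pre># topology.conf: two islands with no common parent switch,
# so Slurm can never place one job across both
SwitchName=avx2sw   Nodes=node[001-100]
SwitchName=avx512sw Nodes=node[101-200]</pre>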
<p>I do something like this because we have 10 nodes set aside for
serial jobs that are connected only by 1 GbE. We obviously don't
want internode jobs running there, so in my topology file, each of
those nodes has its own switch that's not connected to any other
switch. <br>
</p>
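<p>For those serial nodes it looks something like this (names
hypothetical, one line per node):</p>
<pre># topology.conf: one isolated "switch" per 1 GbE serial node
SwitchName=serialsw01 Nodes=serial01
SwitchName=serialsw02 Nodes=serial02
# ...and so on, with none of them connected to a parent switch</pre>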
<blockquote type="cite"
cite="mid:01c9b606-7942-c161-8051-78308f196ff6@vub.be">
<pre class="moz-quote-pre" wrap="">
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">If you want to continue down the road you've already started on, can you
provide more information, like the partition definitions and the gres
definitions? In general, Slurm should support submitting to multiple
partitions.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
As far as I understood it, you can give a comma separated list of
partitions to sbatch, but it's not possible to do this by default?</pre>
</blockquote>
<p><br>
</p>
<p>Incorrect. Giving a comma-separated list is possible and is the
default behavior for Slurm. From the sbatch documentation
(emphasis added to the relevant sentence):<br>
</p>
<p>
<blockquote type="cite">
<dl compact="compact">
<dt><b>-p</b>, <b>--partition</b>=<<i>partition_names</i>></dt>
<dd>
Request a specific partition for the resource allocation. If
not specified,
the default behavior is to allow the slurm controller to
select the default
partition as designated by the system administrator. <b>If
the job can use more
than one partition, specify their names in a comma
separated list and the one
offering earliest initiation will be used with no regard
given to the partition
name ordering (although higher priority partitions will be
considered first).</b>
When the job is initiated, the name of the partition used
will be placed first
in the job record partition string.</dd>
</dl>
</blockquote>
Note that you can't have a job *span* multiple partitions, but I
don't think that was ever your goal.</p>
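<p>So something like this works out of the box (partition names
hypothetical):</p>
<pre>sbatch -p epyc,opteron,general myjob.sh
# the job starts in whichever listed partition can run it first</pre>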
<p><br>
</p>
<p>Prentice<br>
</p>
<blockquote type="cite"
cite="mid:01c9b606-7942-c161-8051-78308f196ff6@vub.be">
</blockquote>
</body>
</html>