<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>Another option might be to estimate the lifetime cost (purchase
      price + average power consumption + maintenance) of each type of
      node and then base the multipliers on that, since not all workloads
      correlate well with LINPACK. Many teaching institutions also give
      students some number of core hours per month for exploratory
      use, usually trying to encourage parallelism across more than
      one node by discounting short runs that use multiple
      nodes. <br>

    </p>
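    <p>A rough sketch of the cost-based idea, in Python. The class and
      function names, the 5-year service life, the electricity price and
      all of the example figures are assumptions for illustration only,
      not measurements from any real cluster. It computes the lifetime
      cost per core for each node type and expresses it relative to the
      cheapest type.</p>
    <pre>
```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class NodeType:
    """Hypothetical per-node cost inputs for one hardware generation."""
    name: str
    purchase_price: float        # currency units per node
    avg_power_kw: float          # average draw per node, in kW
    maintenance_per_year: float  # currency units per node per year
    cores: int


def lifetime_cost(node, years, price_per_kwh):
    """Purchase price + energy cost + maintenance over the service life."""
    energy = node.avg_power_kw * HOURS_PER_YEAR * years * price_per_kwh
    return node.purchase_price + energy + node.maintenance_per_year * years


def cost_multipliers(nodes, years=5, price_per_kwh=0.10):
    """Per-core lifetime cost of each node type, relative to the cheapest."""
    per_core = {n.name: lifetime_cost(n, years, price_per_kwh) / n.cores
                for n in nodes}
    base = min(per_core.values())
    return {name: cost / base for name, cost in per_core.items()}


# Made-up example figures, for illustration only.
nodes = [
    NodeType("older", 5000.0, 0.25, 200.0, 16),
    NodeType("newer", 9000.0, 0.40, 300.0, 32),
]
multipliers = cost_multipliers(nodes)
```
    </pre>
    <p>With these invented numbers the newer node type is the cheapest
      per core and gets a multiplier of 1, while the older type comes
      out somewhat higher. Whether a higher cost should mean a higher or
      a lower charge is a policy decision the sketch does not make.</p>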

    <div class="moz-cite-prefix">On 6/21/19 11:17 PM, Prentice Bisbal

      wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:1952e2c1-fe65-2324-dc8f-f84c6c871810@pppl.gov">


      <p>In this case, I would run LINPACK on each generation of node
        (either the full node or just one core), and then normalize
        the results. I would recommend using the performance
        of a single core of the slowest node as your basis for
        normalization so it has a multiplier of 1, and then the newer
        systems would have a multiplier greater than 1. Then you can
        take that multiplier and multiply it by the number of cores in
        your different systems to get a final multiplier for a whole
        node, if needed. <br>

      </p>
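      <p>The normalization described above might look something like
        this in Python. The function names, the per-core scores and the
        core counts are all made up for illustration; real numbers would
        come from actual LINPACK runs on each node generation.</p>
      <pre>
```python
def per_core_multipliers(per_core_gflops):
    """Scale each per-core score so the slowest generation gets 1.0."""
    base = min(per_core_gflops.values())
    return {name: score / base for name, score in per_core_gflops.items()}


def whole_node_multipliers(core_mult, cores_per_node):
    """Per-core multiplier times core count gives a per-node multiplier."""
    return {name: core_mult[name] * cores_per_node[name] for name in core_mult}


# Hypothetical single-core LINPACK results (GFLOPS) and core counts.
scores = {"sandy_ivy": 10.0, "haswell_broadwell": 20.0, "sky_cascade": 35.0}
cores = {"sandy_ivy": 16, "haswell_broadwell": 24, "sky_cascade": 32}

core_mult = per_core_multipliers(scores)
node_mult = whole_node_multipliers(core_mult, cores)
```
      </pre>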

      <pre class="moz-signature" cols="72">Prentice </pre>

      <div class="moz-cite-prefix">On 6/19/19 3:30 PM, Fulcomer, Samuel

        wrote:<br>

      </div>

      <blockquote type="cite"

cite="mid:CAOORAuFBURpBz-rSiMpKyw8uh+=Wsbi7nsMKoZYp25z9hO0YwQ@mail.gmail.com">


        <div dir="ltr"><br>

          <div>(...and yes, the name is inspired by a certain OEM's

            software licensing schemes...)</div>

          <div><br>

          </div>

          <div>At Brown we run a ~400 node cluster containing nodes of

            multiple architectures (Sandy/Ivy, Haswell/Broadwell, and

            Sky/Cascade) purchased in some cases by University funds and

            in others by investigator funding (~50:50).  They all appear

            in the default SLURM partition. We have 3 classes of SLURM

            users:</div>

          <div><br>

          </div>

          <div>

            <ol>

              <li>Exploratory - no-charge access to up to 16 cores</li>

              <li>Priority - $750/quarter for access to up to 192 cores

                (and with a GrpTRESRunMins=cpu limit). Each user has

                their own QoS</li>

              <li>Condo - an investigator group who paid for nodes added

                to the cluster. The group has its own QoS and SLURM

                Account. The QoS allows use of the number of cores

                purchased and has a much higher priority than the QoSes
                of the "priority" users.</li>

            </ol>

            <div>The first problem with this scheme is that condo users

              who have purchased the older hardware now have access to

              the newest without penalty. In addition, we're

              encountering resistance to the idea of turning off their

              hardware and terminating their condos (despite MOUs

              stating a 5yr life). The pushback is the stated belief

              that the hardware should run until it dies.</div>

          </div>

          <div><br>

          </div>

          <div>What I propose is a new TRES called a Processor
            Performance Unit (PPU) that would be specified on the Node
            line in slurm.conf. Usage counted against a GrpTRES=ppu=N
            limit would then be calculated as the number of allocated
            cores multiplied by their associated PPU numbers.</div>

          <div><br>

          </div>

          <div>We could then assign a base PPU to the oldest hardware,
            say, "1" for Sandy/Ivy, and increase it for later
            architectures based on performance improvement. We'd set the
            condo QoS to GrpTRES=ppu=N*X+M*Y,..., where N is the number
            of cores of the oldest architecture and X is its configured
            PPU/core, M and Y are the same for the next architecture,
            and so on for any newer nodes/cores the investigator has
            purchased since.</div>
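          <div>To make the arithmetic concrete, here is a small Python
            sketch. It is only the calculation, not SLURM
            configuration: the "ppu" TRES is the proposal in this
            message rather than an existing feature, and the PPU values,
            core counts and function names below are assumptions picked
            for illustration.</div>
          <pre>
```python
# Hypothetical PPU/core values; only the base of 1 for Sandy/Ivy
# comes from the proposal, the other two are invented.
PPU_PER_CORE = {
    "sandy_ivy": 1.0,
    "haswell_broadwell": 1.5,
    "sky_cascade": 2.0,
}


def ppu_total(cores_by_arch):
    """Sum of cores * PPU/core over architectures (N*X + M*Y + ...)."""
    return sum(n * PPU_PER_CORE[arch] for arch, n in cores_by_arch.items())


def fits(limit, running, request):
    """True if the request keeps the group's PPU usage within its limit."""
    return limit >= ppu_total(running) + ppu_total(request)


# A condo that bought 64 Sandy/Ivy cores and later 32 Sky/Cascade cores.
limit = ppu_total({"sandy_ivy": 64, "sky_cascade": 32})

# The group already runs on 40 Sky/Cascade cores and asks for more.
running = {"sky_cascade": 40}
ok_request = fits(limit, running, {"sky_cascade": 24})
too_big = fits(limit, running, {"sky_cascade": 25})
```
          </pre>
          <div>With these example values the limit is 128 PPU, so the
            group can occupy at most 64 of the newest cores at once even
            though it purchased 96 cores in total.</div>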

          <div><br>

          </div>

          <div>The result is that the investigator group gets to run on

            an approximation of the performance that they've purchased,

            rather than on the raw purchased core count.</div>

          <div><br>

          </div>

          <div>Thoughts?</div>


        </div>

      </blockquote>

    </blockquote>

  </body>

</html>