<div dir="ltr"><div dir="ltr"><div>Janne, thank you. That FGCI benchmark in a container is pretty smart.</div><div>I always say that real application benchmarks beat synthetic benchmarks.</div><div>Taking a small mix of applications like that and taking the geometric mean of the results is great.</div><div><br></div><div>Note: <i>"a reference result run on a Dell PowerEdge C4130"</i></div><div>In the old days CERN had a standard unit of compute, which was equivalent to a VAX.</div><div>I am sure that unit has long been retired.</div><div>Though I must say that when I participated in CERN tenders a few years ago, they used SPECfp measurements to compare systems.</div><div><br></div></div></div><br><div class="gmail_quote"><div class="gmail_attr" dir="ltr">On Thu, 20 Jun 2019 at 07:41, Janne Blomqvist &lt;<a href="mailto:janne.blomqvist@aalto.fi">janne.blomqvist@aalto.fi</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;padding-left:1ex;border-left-color:rgb(204,204,204);border-left-width:1px;border-left-style:solid">On 19/06/2019 22.30, Fulcomer, 
Samuel wrote:<br>

> <br>

> (...and yes, the name is inspired by a certain OEM's software licensing<br>

> schemes...)<br>

> <br>

> At Brown we run a ~400 node cluster containing nodes of multiple<br>

> architectures (Sandy/Ivy, Haswell/Broadwell, and Sky/Cascade) purchased<br>

> in some cases by University funds and in others by investigator funding<br>

> (~50:50).  They all appear in the default SLURM partition. We have 3<br>

> classes of SLURM users:<br>

> <br>

>  1. Exploratory - no-charge access to up to 16 cores<br>

>  2. Priority - $750/quarter for access to up to 192 cores (and with a<br>

>     GrpTRESRunMins=cpu limit). Each user has their own QoS<br>

>  3. Condo - an investigator group who paid for nodes added to the<br>

>     cluster. The group has its own QoS and SLURM Account. The QoS allows<br>

>     use of the number of cores purchased and has a much higher priority<br>

>     than the QoS' of the "priority" users.<br>

> <br>

> The first problem with this scheme is that condo users who have<br>

> purchased the older hardware now have access to the newest without<br>

> penalty. In addition, we're encountering resistance to the idea of<br>

> turning off their hardware and terminating their condos (despite MOUs<br>

> stating a 5yr life). The pushback is the stated belief that the hardware<br>

> should run until it dies.<br>

> <br>

> What I propose is a new TRES called a Processor Performance Unit (PPU)<br>

> that would be specified on the Node line in slurm.conf, and used such<br>

> that GrpTRES=ppu=N was calculated as the number of allocated cores<br>

> multiplied by their associated PPU numbers.<br>

> <br>

> We could then assign a base PPU to the oldest hardware, say, "1" for<br>

> Sandy/Ivy and increase for later architectures based on performance<br>

> improvement. We'd set the condo QoS to GrpTRES=ppu=N*X+M*Y,..., where N<br>

> is the number of cores of the oldest architecture multiplied by the<br>

> configured PPU/core, X, and repeat for any newer nodes/cores the<br>

> investigator has purchased since.<br>

> <br>

> The result is that the investigator group gets to run on an<br>

> approximation of the performance that they've purchased, rather than on the<br>

> raw purchased core count.<br>
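<br>
The PPU arithmetic proposed above can be sketched as follows. This is only an illustration: a "ppu" TRES is the proposal here, not an existing SLURM feature, and the architecture names, PPU ratings, and core counts below are made-up assumptions.<br>

```python
# Hypothetical per-core PPU ratings. The oldest architecture is the base ("1");
# the other values are illustrative assumptions, not measured results.
PPU_PER_CORE = {
    "sandy_ivy": 1.0,
    "haswell_broadwell": 1.5,
    "sky_cascade": 2.0,
}


def condo_ppu_limit(purchased_cores):
    """Return N*X + M*Y + ...: purchased cores times PPU/core, summed per architecture."""
    return sum(cores * PPU_PER_CORE[arch] for arch, cores in purchased_cores.items())


def job_ppu_usage(arch, allocated_cores):
    """PPUs consumed by a job: allocated cores times the PPU rating of the node type."""
    return allocated_cores * PPU_PER_CORE[arch]


def cores_available(limit, arch):
    """Largest whole number of cores of `arch` that fits under a PPU limit."""
    return int(limit // PPU_PER_CORE[arch])


# A group that bought 64 old cores and later 32 mid-generation cores.
limit = condo_ppu_limit({"sandy_ivy": 64, "haswell_broadwell": 32})
print("GrpTRES=ppu=%g" % limit)
print("cores usable on newest nodes:", cores_available(limit, "sky_cascade"))
```

With these assumed ratings, the group's 96 purchased cores translate into fewer cores on the newest nodes, which is the intended penalty.<br>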

> <br>

> Thoughts?<br>

> <br>

> <br>

<br>

What we do is that our nodes are grouped into separate partitions based<br>

on the CPU model. E.g. the partition "batch-skl" is where our Skylake<br>

(6148) nodes are. Then we have a job_submit.lua script which sends jobs<br>

without an explicit partition spec to all batch-xxx partitions (checking<br>

constraints etc. along the way). Then for each partition we set<br>

TRESBillingWeights= to "normalize" the fairshare consumption based on<br>

the geometric mean of a set of hopefully not too unrepresentative<br>

single-node benchmarks [1].<br>
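<br>
As a rough sketch of the normalization described above: take the per-benchmark speedup of a node type over a reference machine, and use the geometric mean as the CPU billing weight. The benchmark names and timings are invented for illustration, and the mapping from the mean to a weight is an assumption rather than the exact procedure used with [1].<br>

```python
import math


def geometric_mean(values):
    """Geometric mean computed via logarithms; all values must be positive."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def relative_speed(reference_times, node_times):
    """Geometric mean of per-benchmark speedups (reference time / node time)."""
    ratios = [reference_times[name] / node_times[name] for name in reference_times]
    return geometric_mean(ratios)


# Hypothetical wall-clock times in seconds; not real benchmark results.
reference = {"app_a": 100.0, "app_b": 200.0, "app_c": 50.0}
skylake = {"app_a": 50.0, "app_b": 100.0, "app_c": 25.0}

weight = relative_speed(reference, skylake)

# One way the result might be written into a partition definition.
print('TRESBillingWeights="CPU=%.2f"' % weight)
```

The geometric mean is used because it treats a 2x speedup on one application and a 2x slowdown on another as cancelling out, regardless of how long each benchmark runs.<br>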

<br>

We also set a memory billing weight, and have MAX_TRES among our<br>

PriorityFlags, approximating dominant resource fairness (DRF) [2]<br>
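<br>
A simplified sketch of why MAX_TRES approximates DRF: a job is billed for whichever weighted resource it uses most, rather than for the sum of all of them. The weights and job sizes are illustrative assumptions, and the sketch ignores details such as global TRES (e.g. licenses).<br>

```python
def billing(alloc, weights, max_tres=True):
    """Billable units for one allocation.

    With max_tres=True the job pays for its dominant weighted resource;
    otherwise it pays for the sum of all weighted resources.
    """
    weighted = [amount * weights.get(name, 0.0) for name, amount in alloc.items()]
    return max(weighted) if max_tres else sum(weighted)


# Hypothetical weights: 1 billing unit per core, 0.25 per GB of memory.
weights = {"cpu": 1.0, "mem_gb": 0.25}

cpu_heavy = {"cpu": 8, "mem_gb": 8}    # dominant resource: CPU
mem_heavy = {"cpu": 1, "mem_gb": 64}   # dominant resource: memory

for name, job in (("cpu_heavy", cpu_heavy), ("mem_heavy", mem_heavy)):
    print(name, "max:", billing(job, weights), "sum:", billing(job, weights, max_tres=False))
```

Under the max rule, the single-core job that takes 64 GB is billed like a 16-core job, reflecting the share of the node it actually occupies.<br>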

<br>

[1] <a href="https://github.com/AaltoScienceIT/docker-fgci-benchmark" target="_blank" rel="noreferrer">https://github.com/AaltoScienceIT/docker-fgci-benchmark</a><br>

<br>

[2] <a href="https://people.eecs.berkeley.edu/~alig/papers/drf.pdf" target="_blank" rel="noreferrer">https://people.eecs.berkeley.edu/~alig/papers/drf.pdf</a><br>

<br>

-- <br>

Janne Blomqvist, D.Sc. (Tech.), Scientific Computing Specialist<br>

Aalto University School of Science, PHYS & NBE<br>

+358503841576 || <a href="mailto:janne.blomqvist@aalto.fi" target="_blank">janne.blomqvist@aalto.fi</a><br>

<br>

</blockquote></div>