[slurm-users] Multifactor priority configuration

Loris Bennett loris.bennett at fu-berlin.de
Wed Jan 22 14:01:15 UTC 2020


Hi,

We have

  PriorityDecayHalfLife=7-0
  PriorityMaxAge=7-0

which are the defaults.

I don't quite understand the point below about how busy the cluster
is.  If the cluster isn't busy, the jobs won't need to compete and
jobs belonging to users with zero shares will still start.

For me the half-life should be related to the maximum time limit.  If I
allow jobs to run for, say, 14 days, I probably want that CPU-usage to
count against the priority for a similar period, rather than decaying
very rapidly.
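
To put rough numbers on that, here is a small sketch of my own.  It
assumes recorded usage simply decays exponentially with the configured
half-life; the function name is hypothetical and not part of Slurm, and
the real scheduler applies the decay in periodic steps, so treat this
as an approximation only.

```python
def remaining_usage_fraction(elapsed_days: float, half_life_days: float) -> float:
    """Fraction of past usage still counted after elapsed_days.

    Assumption: plain exponential decay, halving every half_life_days.
    This sketch only handles a positive half-life.
    """
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive in this sketch")
    return 0.5 ** (elapsed_days / half_life_days)


# How much of a job's usage still counts 14 days later,
# for half-lives of 1, 7 and 14 days.
for half_life in (1, 7, 14):
    fraction = remaining_usage_fraction(14, half_life)
    print(f"half-life {half_life:2d} days: {fraction:.4%} of usage still counted")
```

With a half-life of 1 day the usage from a 14-day job is effectively
forgotten two weeks later, whereas with a half-life of 14 days half of
it still counts.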

Ultimately the longer the half-life, the "fairer" the priorities will
be.  However, if I have only a few serious multicore power-users, I
might want them to have a bit of an edge over hundreds of individual
users with small numbers of single-core jobs.  In that case I would
shorten the half-life.

To my mind, it is even harder to say what a good value for
PriorityMaxAge is.  The longer it is, the more you reward the time
spent pending.  In
my setup it mainly helps jobs of owners who have used up all their
shares.  As well as getting back shares through the decay of CPU-usage,
which benefits all jobs, ageing benefits individual jobs.  Of course,
how much the jobs benefit depends greatly on the weight you give to
the age factor (PriorityWeightAge).
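
As a rough illustration of how the two settings interact, the sketch
below assumes the age factor grows linearly with pending time and is
capped at 1.0 once the job has waited PriorityMaxAge, and that it is
then multiplied by PriorityWeightAge.  The function names and the
weight of 1000 are made up for the example and are not taken from
anyone's configuration.

```python
def age_factor(pending_days: float, max_age_days: float) -> float:
    """Age factor in [0, 1]: linear in pending time, capped at 1.0.

    Assumption: the factor reaches 1.0 once the job has been pending
    for max_age_days (the PriorityMaxAge setting).
    """
    if max_age_days <= 0:
        raise ValueError("max_age_days must be positive in this sketch")
    return min(pending_days / max_age_days, 1.0)


def age_priority(pending_days: float, max_age_days: float, weight_age: int) -> float:
    """Contribution of the age factor to the job priority."""
    return weight_age * age_factor(pending_days, max_age_days)


# Hypothetical weight, chosen only for illustration.
WEIGHT_AGE = 1000

for pending in (1, 3.5, 7, 14):
    contribution = age_priority(pending, 7, WEIGHT_AGE)
    print(f"pending {pending:>4} days: age contribution {contribution:.0f}")
```

Under these assumptions a job gains nothing further from ageing once it
has been pending longer than PriorityMaxAge, and the size of the
benefit scales directly with the weight.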

Just my 2¢

Loris

Hadrian Djohari <hxd58 at case.edu> writes:

> Hi Killian,
>
> We choose to penalize users only a little for their previous heavy usage, so we use short values:
> PriorityDecayHalfLife=1-0
> PriorityMaxAge=4-0
>
> The busier the cluster, the longer the parameters should be, so that a user's previous jobs restrict their "future" ones more.
> These should be adjusted based on the actual usage and impact to the users.
>
> Best,
> Hadrian
>
> On Wed, Jan 22, 2020 at 4:22 AM Killian Murphy <killian.murphy at york.ac.uk> wrote:
>
>  Hi all.
>
>  I’m interested to learn what people are using for the following configuration items:
>
>  * PriorityDecayHalfLife
>  * PriorityMaxAge
>
>  and why they have chosen to set these as they have. I believe we haven’t got these set quite right on our cluster (3-0 for both items), and some understanding of what other people are doing with these
>  settings might help us to get this right!
>
>  For context, ours is a tier 3 cluster servicing mixed workloads.
>
>  Thanks.
>
>  Killian
>
>  -- 
>  Killian Murphy
>  Research and High Performance Computing Team Leader
>  Research Software Engineer
>
>  Information Services & Wolfson Atmospheric Chemistry Laboratories
>  University of York
>  Heslington
>  York
>  YO10 5DD
>  +44 (0)1904 32 4753
>
>  e-mail disclaimer: http://www.york.ac.uk/docs/disclaimer/email.htm
-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email loris.bennett at fu-berlin.de


