[slurm-users] Effect of PriorityMaxAge on job throughput

Michael Gutteridge michael.gutteridge at gmail.com
Tue Apr 16 15:43:52 UTC 2019


(sorry, kind of fell asleep on you there...)

I wouldn't expect backfill to be a problem, since it shouldn't be starting
jobs that won't complete before the priority reservations start.  We do allow
jobs to run past their time limit (OverTimeLimit), so in our case it can be a
problem.

On one of our cloud clusters we had problems with large jobs getting
starved, so we set "assoc_limit_stop" in the scheduler parameters.  I think
for your config that would mean removing "assoc_limit_continue" (we're on
Slurm 18, where _continue is the default and _stop replaces it if you want
that behavior).  However, that cluster uses the builtin scheduler; I'd
imagine this would play heck with a fairshare/backfill cluster (like our
on-campus one).  Still, it is designed to prevent large-job starvation.
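
Just as a sketch (untested, and assuming the rest of your SchedulerParameters
stay as in the slurm.conf you posted), that line would become something like:

  SchedulerParameters=assoc_limit_stop,batch_sched_delay=20,bf_continue,bf_interval=300,bf_min_age_reserve=10800,bf_window=3600,bf_resolution=600,bf_yield_interval=1000000,partition_job_depth=500,sched_max_job_start=200,sched_min_interval=2000000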

We'd also had some issues with fairshare hitting the limit pretty quickly
(basically it stopped being a useful factor in calculating priority), so we
set FairShareDampeningFactor to 5 to get a little more utility out of it.
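
For reference, on our side that's just the one slurm.conf line, picked up
with an "scontrol reconfigure" (or a slurmctld restart):

  FairShareDampeningFactor=5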

I'd suggest looking at the output of sprio to see how your factors are
working in situ, particularly when you've got a stuck large job.
SMALL_RELATIVE_TO_TIME may be washing out the job size factor if your
larger jobs are also longer.
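
Something like the following is usually enough to see what's going on (the
job id is just a placeholder for one of your stuck large jobs):

  sprio -w           # show the configured factor weights
  sprio -l -j 12345  # per-factor priority breakdown for that job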

HTH.

M


On Wed, Apr 10, 2019 at 2:46 AM David Baker <D.J.Baker at soton.ac.uk> wrote:

> Michael,
>
> Thank you for your reply and your thoughts. These are the priority weights
> that I have configured in the slurm.conf.
>
> PriorityWeightFairshare=1000000
> PriorityWeightAge=100000
> PriorityWeightPartition=1000
> PriorityWeightJobSize=10000000
> PriorityWeightQOS=10000
>
> I've made PWJobSize the highest factor, however I understand that it only
> provides a once-off kick to jobs and so is probably insignificant in the
> longer run. That's followed by PWFairshare.
>
> Should I really be looking at increasing the PWAge factor to help "push"
> jobs through the system?
>
> The other issue that might play a part is that we see a lot of single-node
> jobs (presumably backfilled) getting into the system. Users aren't
> excessively bombing the cluster, but maybe some backfill throttling would be
> useful as well (?)
>
> What are your thoughts having seen the priority factors, please? I've
> attached a copy of the slurm.conf just in case you or anyone else wants to
> take a more complete overview.
>
> Best regards,
> David
>
> ------------------------------
> *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of
> Michael Gutteridge <michael.gutteridge at gmail.com>
> *Sent:* 09 April 2019 18:59
> *To:* Slurm User Community List
> *Subject:* Re: [slurm-users] Effect of PriorityMaxAge on job throughput
>
>
> It might be useful to include the various priority factors you've got
> configured.  The fact that adjusting PriorityMaxAge had a dramatic effect
> suggests that the age factor is pretty high- might be worth looking at that
> value relative to the other factors.
>
> Have you looked at PriorityWeightJobSize?  Might have some utility if
> you're finding large jobs getting short-shrift.
>
>  - Michael
>
>
> On Tue, Apr 9, 2019 at 2:01 AM David Baker <D.J.Baker at soton.ac.uk> wrote:
>
> Hello,
>
> I've finally got the job throughput/turnaround to be reasonable in our
> cluster. Most of the time the job activity on the cluster sets the default
> QOS to 32 nodes (there are 464 nodes in the default queue). Jobs requesting
> a number of nodes close to the QOS level (for example 22 nodes) are scheduled
> within 24 hours, which is better than it has been. Still, I suspect there is
> room for improvement. I note that these large jobs still struggle to be given
> a starttime; however, many jobs are now being given one following my
> SchedulerParameters makeover.
>
> I used advice from the mailing list and the Slurm high throughput document
> to help me make changes to the scheduling parameters. They are now...
>
>
> SchedulerParameters=assoc_limit_continue,batch_sched_delay=20,bf_continue,bf_interval=300,bf_min_age_reserve=10800,bf_window=3600,bf_resolution=600,bf_yield_interval=1000000,partition_job_depth=500,sched_max_job_start=200,sched_min_interval=2000000
>
> Also..
> PriorityFavorSmall=NO
> PriorityFlags=SMALL_RELATIVE_TO_TIME,ACCRUE_ALWAYS,FAIR_TREE
> PriorityType=priority/multifactor
> PriorityDecayHalfLife=7-0
> PriorityMaxAge=1-0
>
> The most significant change was actually reducing "PriorityMaxAge" from 7-0
> to 1-0. Before that change the larger jobs could hang around in the queue
> for days. Does it therefore make sense to reduce PriorityMaxAge further, to
> less than 1 day? Your advice would be appreciated, please.
>
> Best regards,
> David
>
>
>
>