[slurm-users] Backfill advice

Douglas Jacobsen dmjacobsen at lbl.gov
Sat Mar 23 13:30:46 UTC 2019


Hello,

At first blush, bf_continue and bf_interval, as well as bf_max_job_test,
are critical first steps in tuning. Setting DebugFlags=Backfill is
essential for getting the data needed to make tuning decisions.
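
For concreteness, a minimal slurm.conf sketch with illustrative values
(the right numbers depend heavily on your job mix and Slurm version):

    # Let backfill resume where it left off after yielding locks, run a
    # pass every 60 seconds, and consider up to 1000 jobs per pass.
    SchedulerParameters=bf_continue,bf_interval=60,bf_max_job_test=1000

    # Log backfill decisions so you have data to tune against.
    DebugFlags=Backfill

The flag can also be flipped at runtime with "scontrol setdebugflags
+backfill" rather than editing slurm.conf and reconfiguring.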

Per-user or per-account backfill limits, if set too low, can also cause
starvation, depending on how your priority calculation is set up.
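
For example, the per-user cap lives on the same SchedulerParameters line;
20 here is only a placeholder. If 20 of a user's smaller, higher-priority
jobs are tested first in a pass, a large job sitting below them is skipped
and never gets a planning reservation:

    SchedulerParameters=bf_continue,bf_interval=60,bf_max_job_test=1000,bf_max_job_user=20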

I presented these slides at the Slurm User Group a few years ago on this
topic:
https://slurm.schedmd.com/SLUG16/NERSC.pdf

The key thing to keep in mind with large jobs is that Slurm needs to
evaluate them in the same order on every backfill pass, or their scheduled
start times may drift. Thus it is important that once jobs start getting
planning reservations, they continue to get them.
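
A quick way to check whether a large pending job is actually holding a
planned start time is squeue's --start view (12345 is a made-up job id):

    squeue --start            # expected start times for all pending jobs
    squeue --start -j 12345   # the same for a single job

Jobs that never show a start time are the ones not getting (or not
keeping) a planning reservation.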

Because of the prevalence of large jobs at our site, we use
bf_min_prio_reserve, which splits the priority space into a reserving set
and a non-reserving set; we then use job age to let jobs climb from the
non-reserving portion of the priority space into the reserving portion.
The recent MaxJobsAccruePerUser limit on a job QOS can throttle the rate
at which jobs accrue age and prevent the negative effects of users
submitting large numbers of jobs.
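
A sketch of how those pieces fit together, with placeholder thresholds and
weights (this assumes PriorityType=priority/multifactor, and the QOS name
"normal" is just an example):

    # slurm.conf: only jobs with priority >= 100000 receive planning
    # reservations; everything below is backfilled around them.
    SchedulerParameters=bf_continue,bf_interval=60,bf_max_job_test=1000,bf_min_prio_reserve=100000

    # Give the age factor enough weight that waiting jobs can climb
    # across that threshold.
    PriorityWeightAge=200000
    PriorityMaxAge=7-0

    # Run once with sacctmgr: only this many of a user's pending jobs
    # accrue age at a time, throttling how fast a flood of jobs climbs.
    sacctmgr modify qos normal set MaxJobsAccruePerUser=8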

I realize that is a lot of tunables and concepts densely packed, but it
should give you some reasonable starting points.

Doug


On Sat, Mar 23, 2019 at 05:26 david baker <djbaker12 at gmail.com> wrote:

> Hello,
>
> We do have large jobs getting starved out on our cluster, and I note in
> particular that we never see a pending job get assigned a start time. It
> seems very possible that backfilled jobs are stealing nodes reserved for
> large, higher-priority jobs.
>
> I'm wondering whether our backfill configuration has any bearing on this
> issue or whether we have been unfortunate enough to hit a bug. One
> parameter missing from our bf setup is "bf_continue". Is that parameter
> significant in ensuring that bf drills down far enough into the job mix?
> Also, we are using the default bf frequency -- should we reduce the
> frequency, and potentially reduce the number of bf jobs per group/user
> or in total at each iteration? Currently, I think we set the per-user
> limit to 20.
>
> Any thoughts would be appreciated, please.
>
> Best regards,
> David
>