[slurm-users] Backfill advice
djbaker12 at gmail.com
Sat Mar 23 12:06:15 UTC 2019
We have large jobs being starved on our cluster, and I note in particular
that we never see a pending job being assigned a start time. It seems quite
possible that backfilled jobs are taking nodes that should be reserved for
large or higher-priority jobs.

I'm wondering whether our backfill configuration has any bearing on this
issue, or whether we are unfortunate enough to have hit a bug. One parameter
missing from our bf setup is "bf_continue". Is that parameter significant in
ensuring that bf looks deep enough into the job queue? Also, we are using
the default bf frequency. Should we reduce the frequency, and potentially
reduce the number of bf jobs considered per group/user or in total at each
iteration? Currently, I think we are setting the per-user limit to 20.
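
To make the question concrete, here is a sketch of the kind of slurm.conf
fragment I have in mind. The values are illustrative rather than copied from
our configuration, and I am assuming that the per-user limit mentioned above
corresponds to bf_max_job_user.

```
# slurm.conf (illustrative sketch only, not our actual configuration)
SchedulerType=sched/backfill

# bf_continue      keep processing the original job list after the backfill
#                  scheduler periodically releases its locks
# bf_interval      seconds between backfill iterations (30 is the default)
# bf_max_job_user  maximum jobs per user that backfill attempts to start
#                  (assumed to be the "per-user limit" of 20 mentioned above)
SchedulerParameters=bf_continue,bf_interval=30,bf_max_job_user=20
```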

Any thoughts would be much appreciated.