[slurm-users] Longer queuing times for larger jobs
Chris Samuel
chris at csamuel.org
Thu Feb 13 07:27:40 UTC 2020
On 5/2/20 1:44 pm, Antony Cleave wrote:
> Hi, from what you are describing it sounds like jobs are backfilling in
> front and stopping the large jobs from starting
We use a feature that SchedMD implemented for us called
"bf_min_prio_reserve", which lets you set a priority threshold. Slurm
won't make a forward reservation for any job below that threshold, so
such a job can only start if it can start right now without delaying
other jobs.
https://slurm.schedmd.com/slurm.conf.html#OPT_bf_min_prio_reserve
So if you can arrange your local priority system so that large jobs are
over that threshold and smaller jobs are below it (or whatever suits
your use case) then you should have a way to let these large jobs get a
reliable start time without smaller jobs pushing them back in time.
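To make the threshold idea concrete, here is a rough Python sketch of the decision as I understand it. It is a toy model, not Slurm's actual scheduler code: the Job class, the plan() function and all the numbers are made up for illustration, and it ignores time limits entirely (real backfill also checks that a job starting now won't delay an existing reservation).

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int  # hypothetical priority value from your local priority setup
    nodes: int     # nodes requested


def plan(jobs, free_nodes, threshold):
    """Toy model of a bf_min_prio_reserve-style threshold.

    In slurm.conf the threshold would be set with something like
    SchedulerParameters=bf_min_prio_reserve=<threshold>.
    Jobs are considered in descending priority order.
    """
    decisions = {}
    for job in sorted(jobs, key=lambda j: j.priority, reverse=True):
        if job.nodes <= free_nodes:
            # Fits in the currently idle nodes, so it can start immediately.
            free_nodes -= job.nodes
            decisions[job.name] = "start now"
        elif job.priority >= threshold:
            # At or above the threshold: gets a forward reservation.
            decisions[job.name] = "reserve future start"
        else:
            # Below the threshold: no reservation, just keeps waiting.
            decisions[job.name] = "wait, no reservation"
    return decisions


# Made-up example: one large high-priority job, two small low-priority ones.
jobs = [
    Job("large", priority=5000, nodes=64),
    Job("small-a", priority=100, nodes=2),
    Job("small-b", priority=100, nodes=40),
]
result = plan(jobs, free_nodes=8, threshold=1000)
for name, decision in result.items():
    print(f"{name}: {decision}")
```

The point of the sketch is that only the large job gets a guaranteed future slot. The small jobs either fit into the idle nodes right now or wait without claiming anything, so they cannot pile up reservations ahead of the large job.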
There's some useful background from the bug where this was implemented:
https://bugs.schedmd.com/show_bug.cgi?id=2565
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA