[slurm-users] Slurm very rarely assigned an estimated start time to a job

Mark Hahn hahn at mcmaster.ca
Wed Oct 2 17:34:43 UTC 2019


>(most likely in the next year). My reaction is that Slurm very rarely
>provides an estimated start time for a job. I understand that this is not
>possible for jobs on hold and dependent jobs.

it's also not possible unless both running and queued jobs
have definite termination times (time limits); do yours?

my understanding is the following:
the main scheduler does not perform forward planning.
that is, it is opportunistic.  it walks the list of priority-sorted
pending jobs, starting any which can run on currently free
(or preemptable) resources.
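
as a toy illustration of that description (this is not Slurm's code: the
Job fields and the function name are invented here, resources are reduced
to a single node count, and preemption is ignored), one opportunistic pass
might look like this:

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int
    nodes: int  # nodes requested (hypothetical, simplified resource model)


def main_scheduler_pass(pending, free_nodes):
    """One opportunistic pass: start whatever fits right now, plan nothing."""
    started = []
    # walk pending jobs from highest to lowest priority
    for job in sorted(pending, key=lambda j: j.priority, reverse=True):
        if job.nodes <= free_nodes:
            free_nodes -= job.nodes
            started.append(job.name)
        # a job that does not fit simply stays pending, with no start time
    return started, free_nodes


pending = [Job("a", 100, 8), Job("b", 50, 2), Job("c", 10, 4)]
started, left = main_scheduler_pass(pending, free_nodes=6)
print(started, left)
```

note that job "a" gets neither resources nor an estimated start time from
a pass like this; nothing here looks into the future.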

the backfill scheduler is a secondary, asynchronous loop that tries hard
not to interfere with the main scheduler (severely throttles itself)
and tries to place start times for pending jobs.
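
a minimal sketch of the forward-planning idea, assuming every running and
pending job has a time limit and that resources are just a node count (the
class names, fields and plan_start_times are hypothetical; the real
backfill.c is far more involved):

```python
from dataclasses import dataclass


@dataclass
class Running:
    nodes: int
    end: int  # minutes from now until the job's time limit expires


@dataclass
class Pending:
    name: str
    priority: int
    nodes: int
    limit: int  # requested time limit in minutes


def free_at(t, total, reservations):
    """Nodes free at time t, given (start, end, nodes) reservations."""
    return total - sum(n for s, e, n in reservations if s <= t < e)


def fits(start, job, total, reservations):
    """True if job.nodes are free for the whole window [start, start+limit)."""
    end = start + job.limit
    # free capacity can only drop where another reservation begins
    points = [start] + [s for s, e, n in reservations if start < s < end]
    return all(free_at(p, total, reservations) >= job.nodes for p in points)


def plan_start_times(total, running, pending):
    """Assign each pending job the earliest start that delays no earlier plan."""
    reservations = [(0, r.end, r.nodes) for r in running]
    starts = {}
    for job in sorted(pending, key=lambda j: j.priority, reverse=True):
        # capacity only increases when a reservation ends, so those are
        # the only start times worth trying (plus "now")
        for t in sorted({0} | {e for s, e, n in reservations}):
            if fits(t, job, total, reservations):
                reservations.append((t, t + job.limit, job.nodes))
                starts[job.name] = t
                break
    return starts


running = [Running(nodes=6, end=60), Running(nodes=4, end=120)]
pending = [Pending("big", 100, 8, 60),
           Pending("small", 10, 4, 30),
           Pending("long", 5, 4, 90)]
print(plan_start_times(10, running, pending))
```

here "small" is backfilled at t=60, ahead of the higher-priority "big"
(t=120), because it finishes before "big" could start anyway; "long" would
overlap "big" and so has to wait until t=180. without the end/limit
fields none of these start times could be computed at all.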

the main issue with forward scheduling is that if high-prio jobs become
runnable (submitted, off hold, dependency-satisfied), then most of the 
(tentative) start times probably need to be removed.
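
to see why, here is a deliberately crude model (an assumption for
illustration only: every job needs the whole machine, so jobs just run one
after another in priority order; tentative_starts is a made-up name):

```python
def tentative_starts(jobs):
    """jobs: (name, priority, limit_minutes) tuples; returns name -> start."""
    starts, t = {}, 0
    for name, _prio, limit in sorted(jobs, key=lambda j: j[1], reverse=True):
        starts[name] = t
        t += limit  # the next job can start only after this one's limit
    return starts


queue = [("a", 50, 60), ("b", 40, 30), ("c", 30, 90)]
before = tentative_starts(queue)
# a high-priority job becomes runnable
after = tentative_starts(queue + [("urgent", 99, 120)])
changed = sorted(n for n in before if before[n] != after[n])
print(before, after, changed)
```

a single new high-priority job moves the planned start of every job
behind it, so all of the earlier estimates are stale.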

a quick look at plugins/sched/backfill/backfill.c indicates that things 
are /complicated/ ;)

we (ComputeCanada) don't see a lot of forward start times either.

I also would welcome discussion of how to tune the backfill scheduler!
I suspect that in order to work well, it needs a particular distribution
of job priorities.

regards, mark hahn.


