[slurm-users] Scheduler does not reserve resources

Rémi Palancher remi at rackslab.io
Mon Jan 17 08:04:24 UTC 2022


Hi Jérémy,

On Wednesday, January 12, 2022 at 16:59, Jérémy Lapierre <jeremy.lapierre at uni-saarland.de> wrote:

> Hi To all slurm users,
>
> We have the following issue: jobs with highest priority are pending
> forever with "Resources" reason. More specifically, the jobs pending
> forever ask for 2 full nodes but all other jobs from other users
> (running or pending) need only a 1/4 of a node, then pending jobs asking
> for 1/4 of a node always get allocated and the jobs asking for 2 nodes
> are pending forever even though the priority is higher than the ones
> asking for less resources. I hope I'm clear enough, if not please look
> at page 17 on https://slurm.schedmd.com/SUG14/sched_tutorial.pdf, in our
> situation an infinite number of jobs will fit before what is job4 in the
> scheme p. 17 and thus job4 will never be launched.

Backfilling does not delay the scheduled start time of higher-priority jobs,
but this only holds if those jobs have a scheduled start time in the first place.

Did you check the start time of your job pending with the Resources reason, e.g.
with `scontrol show job <id> | grep StartTime`?
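
If you want just the value rather than the whole line, here is a minimal
sketch. The helper name `start_time_of` is hypothetical, and it assumes the
`scontrol show job` output contains a `StartTime=<value>` field separated
from the next field by a space. I would expect `Unknown` when Slurm has no
estimate, but please check against your own output.

```shell
# Hypothetical helper: read `scontrol show job` output on stdin and
# print the value of the first StartTime= field it finds.
start_time_of() {
    grep -o 'StartTime=[^ ]*' | head -n 1 | cut -d= -f2
}

# Intended usage on a cluster (not executed here):
#   scontrol show job <id> | start_time_of
```

`squeue --start -j <id>` should report the same expected start time in
tabular form.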

Sometimes Slurm is unable to determine the start time of a pending job. One
typical reason is the absence of a time limit on the running jobs.

In this case Slurm cannot determine when the running jobs will end or when
the next highest-priority job can start, and therefore cannot tell whether
lower-priority jobs would actually delay higher-priority jobs.
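
To see whether this applies to you, one option is to list the running jobs
together with their time limit and keep those without one. This is only a
sketch: the helper name `jobs_without_time_limit` is hypothetical, and it
assumes two whitespace-separated columns (job id, time limit) as produced
by the `squeue` format string `%i %l`, where the limit is shown as
`UNLIMITED` or `NOT_SET` when none applies.

```shell
# Hypothetical helper: read "<jobid> <timelimit>" lines on stdin and
# print the ids of jobs whose limit is UNLIMITED or NOT_SET.
jobs_without_time_limit() {
    awk '$2 == "UNLIMITED" || $2 == "NOT_SET" { print $1 }'
}

# Intended usage on a cluster (not executed here):
#   squeue --states=RUNNING --noheader --format="%i %l" | jobs_without_time_limit
```

If this prints any job ids, those running jobs have no known end time,
which would be consistent with the scheduler being unable to plan a start
time for the 2-node jobs.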

--
Rémi Palancher
Rackslab: Open Source Solutions for HPC Operations
https://rackslab.io


