[slurm-users] Overzealous PartitionQoS Limits
Christoph Brüning
christoph.bruening at uni-wuerzburg.de
Wed May 20 10:00:31 UTC 2020
Dear all,
we set up a floating partition as described in SLURM's QoS documentation
to allow for jobs with a longer-than-usual walltime on part of our
cluster: a QoS with GrpCPUs and GrpNodes limits attached to the
longer-walltime partition, which contains all nodes.
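For reference, a minimal sketch of what such a setup looks like (the QoS name
and limit values below are placeholders, not our actual numbers; on 17.11 the
CPU and node caps are expressed via GrpTRES):

$ sacctmgr add qos long_limits
$ sacctmgr modify qos where name=long_limits set GrpTRES=cpu=256,node=16

and in slurm.conf:

PartitionName=long Nodes=ALL MaxTime=14-00:00:00 QOS=long_limits State=UP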
We observe jobs stuck in the queue like this:
$ squeue -o "%.7i %.9P %.2t %.6C %.20S %R"
JOBID PARTITION ST CPUS START_TIME NODELIST(REASON)
1108810 long PD 2 N/A (QOSGrpNodeLimit)
1108811 long PD 2 N/A (QOSGrpNodeLimit)
1108812 long PD 2 N/A (QOSGrpNodeLimit)
1108813 long PD 2 N/A (QOSGrpNodeLimit)
1108814 long PD 2 N/A (QOSGrpNodeLimit)
1108815 long PD 2 N/A (QOSGrpNodeLimit)
1108816 long PD 2 N/A (QOSGrpNodeLimit)
1108817 long PD 2 N/A (QOSGrpNodeLimit)
1108818 long PD 2 N/A (QOSGrpNodeLimit)
[...]
However, we are nowhere near any of the GrpNodes or GrpCPUs limits, and
there are nodes in MIXED state that should have slots available for
two-CPU jobs.
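In case it is useful for comparison, the configured limits and the usage
counted against them can be checked along these lines (output details may
differ slightly on 17.11):

$ sacctmgr show qos format=Name,GrpTRES%30
$ scontrol show assoc_mgr flags=QOS | grep -i grptres

plus a quick sum over the running jobs in the partition:

$ squeue -h -t R -p long -o "%C" | awk '{c+=$1} END {print c" CPUs in use"}'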
The jobs in question even have the highest priority in the queue (except
for two jobs on a special-hardware partition), and their "Dependency="
field is empty.
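For completeness, one way to double-check priority and dependencies (the job
ID is just one of the examples above; %Q is the job priority, %E the
remaining dependency list):

$ squeue -p long -t PD -o "%.8i %.10Q %E" | sort -k2,2nr | head
$ scontrol show job 1108810 | grep -o "Dependency=[^ ]*"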
It seems that those jobs are occasionally assigned a start time when the
scheduler runs, but that is quickly reverted to "N/A".
Did any of you observe this or similar behaviour?
FWIW, we are running SLURM 17.11 on Debian; an upgrade to 19.05 is
scheduled for the next couple of weeks.
Best,
Christoph
--
Dr. Christoph Brüning
Universität Würzburg
Rechenzentrum
Am Hubland
D-97074 Würzburg
Tel.: +49 931 31-80499