[slurm-users] clarification on slurm scheduler and the "nice" parameter

Matteo F mfasco984 at gmail.com
Tue Apr 14 07:00:19 UTC 2020


Hello there,
I am having trouble understanding how the Slurm scheduler handles the
"nice" parameter.

I have two types of job: one is low priority and uses 4 CPUs (--nice=20),
the other one is high priority and uses 24 CPUs (--nice=10).
When I submit, let's say, 50 low-priority jobs, only 6 are executed - this
is fine since a job uses 4 CPUs and the node has 24.
However, when I submit my high-priority job, which needs all 24 CPUs, the
behavior is not what I expect.

What I was expecting:
- Slurm would stop starting queued low-priority jobs (no more PD -> R
transitions)
- it would wait until 24 CPUs are free (in this case, until no jobs are running)
- it would run the high-priority job
- once that job has completed, it would start the low-priority jobs as usual
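
To make the expectation above concrete, here is a toy model of it in plain
shell arithmetic. It is only a sketch of what I expected, not of how Slurm
actually schedules: the variable names are made up for illustration, and it
assumes that pending low-priority jobs are held back for as long as the
high-priority job is pending.

```shell
#!/bin/sh
# Toy model of the EXPECTED behaviour; this is not Slurm's real algorithm.
total_cpus=24     # CPUs on node01
low_cpus=4        # CPUs per low-priority job (--nice=20)
high_cpus=24      # CPUs for the high-priority job (--nice=10)

# Six low-priority jobs fill the node: 24 / 4 = 6.
running_low=$(( total_cpus / low_cpus ))
free_cpus=$(( total_cpus - running_low * low_cpus ))
high_started=no

# Each iteration models one running low-priority job finishing.
while [ "$running_low" -gt 0 ]; do
    running_low=$(( running_low - 1 ))
    free_cpus=$(( free_cpus + low_cpus ))
    # Assumption: freed CPUs are NOT handed to pending low-priority jobs;
    # they are kept idle until the high-priority job fits.
    if [ "$free_cpus" -ge "$high_cpus" ]; then
        high_started=yes
        free_cpus=$(( free_cpus - high_cpus ))
    fi
    echo "running_low=$running_low free_cpus=$free_cpus high_started=$high_started"
done
```

In this model the high-priority job starts only after the sixth low-priority
job has finished, and no new low-priority job starts in the meantime.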

What I instead observed:
- Slurm keeps starting queued jobs as if I had not specified a nice parameter.


(partial) slurm config:
SwitchType=switch/none
TaskPlugin=task/none
FastSchedule=1
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
NodeName=node01 CPUs=24 RealMemory=120000 Sockets=2 CoresPerSocket=6 ThreadsPerCore=2 State=UNKNOWN

Low priority job:
#SBATCH --job-name=task4
#SBATCH --ntasks=4
#SBATCH --mem=1gb
#SBATCH --time=10:00:00
#SBATCH --output=%j.out
#SBATCH --error=%j.err
#SBATCH --partition=ogre
#SBATCH --account=ogre
#SBATCH --nice=20

High priority job:
#SBATCH --job-name=task24
#SBATCH --ntasks=24
#SBATCH --mem=1gb
#SBATCH --time=10:00:00
#SBATCH --output=%j.out
#SBATCH --error=%j.err
#SBATCH --partition=ogre
#SBATCH --account=ogre
#SBATCH --nice=10

Do you have any idea of what I am missing?

Thanks a lot.
Matteo