[slurm-users] Suspend QOS help
Walls, Mitchell
miwalls at siue.edu
Fri Feb 18 15:20:16 UTC 2022
Hello,
I'm hoping someone can shed some light on why jobs are running on the same nodes simultaneously, rather than the lower-priority job actually being suspended. I can provide more info if anyone has an idea of what might help!
# Relevant config.
PreemptType=preempt/qos
PreemptMode=SUSPEND,GANG
PartitionName=general Default=YES Nodes=general OverSubscribe=FORCE:1 MaxTime=30-00:00:00 Qos=general AllowQos=general
PartitionName=suspend Default=NO Nodes=general OverSubscribe=FORCE:1 MaxTime=30-00:00:00 Qos=suspend AllowQos=suspend
# Qoses
      Name   Priority    Preempt PreemptMode
---------- ---------- ---------- -----------
   general       1000    suspend     cluster
   suspend        100                cluster
# squeue (note: htop also shows both processes actually running at the same time, not being timesliced)
$ squeue
JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
45085   general stress.s    user2  R  7:33     2 node[04-05]
45084   suspend stress-s    user1  R  7:40     2 node[04-05]
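
For anyone wanting to spot the same symptom quickly, here is a rough sketch of a check over squeue-style output. It is only an illustration: the function name shared_running_jobs is made up for this example, the sample text is copied from the listing above, and it only catches jobs whose NODELIST strings are identical (it does not expand node ranges or detect partial overlap).

```shell
#!/bin/sh
# Hypothetical helper: report pairs of jobs in state R that list the
# same NODELIST string. Expects the default squeue column layout shown
# above (JOBID is field 1, ST is field 5, NODELIST(REASON) is field 8).
shared_running_jobs() {
    awk 'NR > 1 && $5 == "R" {
             if ($8 in seen) print seen[$8], $1, $8   # earlier job, this job, nodes
             else seen[$8] = $1                        # remember first job on these nodes
         }'
}

# Sample input copied from the squeue output in this message.
sample='JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
45085 general stress.s user2 R 7:33 2 node[04-05]
45084 suspend stress-s user1 R 7:40 2 node[04-05]'

result=$(printf '%s\n' "$sample" | shared_running_jobs)
echo "$result"
```

On a live cluster the function could presumably be fed from squeue directly (squeue | shared_running_jobs). With preemption working as intended, I would expect the lower-priority job to show state S rather than R and so not be reported.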
Thanks!