[slurm-users] Suspend QOS help
Walls, Mitchell
miwalls at siue.edu
Fri Feb 18 15:20:16 UTC 2022

Hello,
I'm hoping someone can shed some light on why jobs from two QOSes are running on the same nodes simultaneously, rather than the lower-priority job actually being suspended. I can provide more information if that would help.
# Relevant config.
PreemptType=preempt/qos
PreemptMode=SUSPEND,GANG
PartitionName=general Default=YES Nodes=general OverSubscribe=FORCE:1 MaxTime=30-00:00:00 Qos=general AllowQos=general
PartitionName=suspend Default=NO  Nodes=general OverSubscribe=FORCE:1 MaxTime=30-00:00:00 Qos=suspend AllowQos=suspend
# QOSes
      Name   Priority    Preempt PreemptMode
---------- ---------- ---------- -----------
   general       1000    suspend     cluster
   suspend        100                cluster
# squeue (note: htop shows both jobs' processes actually running at the same time, not being time-sliced)
$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
             45085   general  stress.s   user2   R       7:33     2 node[04-05]
             45084   suspend  stress-s   user1   R       7:40     2 node[04-05]
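
In case it helps anyone reproduce the check, here is a rough sketch of how the overlap above could be spotted from squeue output. The function name shared_running_nodes is made up for illustration. It assumes the default 8-column squeue format with no spaces in job names, and it compares NODELIST strings literally, so it does not expand ranges like node[04-05] and would miss partial overlaps.

```shell
# Sketch: print every NODELIST value that is shared by more than one job
# in state R (running). Reads default-format squeue output on stdin.
# Assumptions: ST is column 5 and NODELIST is column 8; node lists are
# compared as literal strings, not expanded into individual hostnames.
shared_running_nodes() {
    awk '$5 == "R" { count[$8]++ }
         END { for (n in count) if (count[n] > 1) print n }'
}

# Usage on a live cluster:
#   squeue | shared_running_nodes
```

If preemption by suspension were working, the job in the suspend QOS would be expected to show state S rather than R, and this would print nothing for those nodes.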
Thanks!
    
    