[slurm-users] Fwd: Using PreemptExemptTime

Phil Kauffman philip at kauffman.me
Thu Feb 3 22:12:05 UTC 2022


 > I know you want to suspend preempted jobs, but what happens if you
 > cancel them instead?

Thanks John. Your response definitely helped me. I did as you 
suggested and tested CANCEL, which worked.


For John and everyone else: below are the results of my tests. My 
apologies for the wall of text.

My testing seems to confirm that there is a difference between what the 
man page says should work and what actually happens when using 
SUSPEND,GANG with PreemptType preempt/qos or preempt/partition_prio.


I've verified that with 'preempt/qos', using CANCEL or REQUEUE and 
launching jobs on the same partition works as you say and as the man 
page describes.

Below are my tests:

For all tests the below was configured:
# sacctmgr show qos format=name,priority,preempt -p
Name|Priority|Preempt|
normal|1||
expedite|2|normal|

QOS `expedite` can preempt QOS `normal`.
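
A minimal sketch of checking that relationship programmatically. It 
parses the pipe-delimited text shown above; the function names are 
hypothetical, and it assumes that multiple entries in the Preempt 
column would be comma-separated.

```python
SAMPLE_QOS_OUTPUT = """\
Name|Priority|Preempt|
normal|1||
expedite|2|normal|
"""


def split_row(line):
    # With -p, every row ends with one trailing "|"; drop only that one
    # so that empty columns (like an empty Preempt field) are preserved.
    if line.endswith("|"):
        line = line[:-1]
    return line.split("|")


def parse_qos_table(text):
    """Return {qos_name: {"priority": int, "preempts": [qos_name, ...]}}."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header = [name.lower() for name in split_row(lines[0])]
    table = {}
    for line in lines[1:]:
        row = dict(zip(header, split_row(line)))
        table[row["name"]] = {
            "priority": int(row["priority"]),
            # Assumption: several preemptable QOS names are comma-separated.
            "preempts": [q for q in row["preempt"].split(",") if q],
        }
    return table


def can_preempt(table, preemptor, preemptee):
    """True if QOS `preemptor` lists QOS `preemptee` in its Preempt column."""
    return preemptee in table[preemptor]["preempts"]


qos_table = parse_qos_table(SAMPLE_QOS_OUTPUT)
print(can_preempt(qos_table, "expedite", "normal"))
print(can_preempt(qos_table, "normal", "expedite"))
```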


Test 1: preempt/qos, CANCEL

slurm.conf:
   PreemptType: preempt/qos
   PreemptMode: 'CANCEL' # requeue works with this option as well. 

   PreemptExemptTime: '00:00:00'

   PartitionName=DEFAULT OverSubscribe=FORCE:1 Nodes=slurm[2-4]
   PartitionName=active Default=YES QOS=normal
   PartitionName=hipri  Default=NO  QOS=expedite


QOS:
   sacctmgr -i modify qos where name=normal set PreemptExemptTime=00:03:00 PreemptMode=CANCEL
   sacctmgr -i modify qos where name=expedite set PreemptExemptTime=-1 PreemptMode=OFF



Result: PASS
'normal' QOS job gets canceled and 'expedite' job starts after waiting 
for PreemptExemptTime.
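
The wait observed here can be sketched as simple arithmetic: a job 
becomes eligible for preemption once it has run for the effective 
PreemptExemptTime. The sketch below is illustrative only. It assumes 
that a QOS-level value overrides the global slurm.conf value, that -1 
on a QOS means "not set at this level", and it only handles the 
HH:MM:SS form used in these tests. The function names are hypothetical.

```python
from datetime import datetime, timedelta


def parse_exempt_time(value):
    """Convert HH:MM:SS to a timedelta; return None for -1 (assumed unset)."""
    value = value.strip()
    if value == "-1":
        return None
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def effective_exempt_time(qos_value, global_value):
    """Assumption: the QOS value wins when set, else the global value applies."""
    qos_exempt = parse_exempt_time(qos_value)
    if qos_exempt is not None:
        return qos_exempt
    global_exempt = parse_exempt_time(global_value)
    return global_exempt if global_exempt is not None else timedelta(0)


def earliest_preemption(job_start, qos_value, global_value):
    """Earliest time at which the job could be considered for preemption."""
    return job_start + effective_exempt_time(qos_value, global_value)


# Values from Test 1: QOS 'normal' has 00:03:00, slurm.conf has 00:00:00.
# The start time is made up for illustration.
start = datetime(2022, 2, 3, 22, 0, 0)
print(earliest_preemption(start, "00:03:00", "00:00:00"))
```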


Test 2: preempt/qos, REQUEUE

slurm.conf:
   PreemptType: preempt/qos
   PreemptMode: 'CANCEL' # requeue works with this option as well. 

   PreemptExemptTime: '00:00:00'

   PartitionName=DEFAULT OverSubscribe=FORCE:1 Nodes=slurm[2-4]
   PartitionName=active Default=YES QOS=normal
   PartitionName=hipri  Default=NO  QOS=expedite

QOS:
   sacctmgr -i modify qos where name=normal set PreemptExemptTime=00:03:00 PreemptMode=REQUEUE
   sacctmgr -i modify qos where name=expedite set PreemptExemptTime=-1 PreemptMode=OFF


Result: PASS
'normal' QOS job gets requeued and 'expedite' job starts after waiting 
for PreemptExemptTime.



Test 3: preempt/qos, SUSPEND,GANG

slurm.conf:
   PreemptType: preempt/qos
   PreemptMode: 'SUSPEND,GANG'
   PreemptExemptTime: '00:00:00'

   PartitionName=DEFAULT OverSubscribe=FORCE:1 Nodes=slurm[2-4]
   PartitionName=active Default=YES QOS=normal
   PartitionName=hipri  Default=NO  QOS=expedite

QOS:
   sacctmgr -i modify qos where name=normal set PreemptExemptTime=00:03:00 PreemptMode=SUSPEND
   sacctmgr -i modify qos where name=expedite set PreemptExemptTime=-1 PreemptMode=OFF

From https://slurm.schedmd.com/preempt.html
(PreemptMode > SUSPEND > NOTE):


"If PreemptType=preempt/qos is configured and if the preempted job(s) 
and the preemptor job from are on the same partition, then they will 
share resources with the Gang scheduler (time-slicing)."

Result for same partition: PASS
Submitting on the same partition with a different QOS lets the jobs 
share time on the same resource.


Now getting to the function I wanted:

"If not (i.e. if the preemptees and preemptor are on different 
partitions) then the preempted jobs will remain suspended until the 
preemptor ends."

Result for submitting on different, overlapping partitions: FAIL

Submitting 'normal' QOS jobs and then one 'expedite' QOS job from 
another user results in both jobs running on the same node. No suspend, 
requeue, or cancel occurs. This is not wanted, probably ever.

The desired behavior, which is what the man page describes, is for the 
'normal' job to be suspended. I don't see that occurring.
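
One way to spot this failure without reading the queue by eye is to 
look for nodes where RUNNING jobs from more than one QOS coexist. The 
sketch below assumes input lines in the form job ID, QOS, state, and 
node, as `squeue -h -o "%i|%q|%T|%N"` would print them, and that each 
job runs on a single node so the node list is one plain hostname. The 
sample rows and the function name are made up for illustration; they 
are not captured output.

```python
from collections import defaultdict

# Hypothetical sample in "jobid|qos|state|node" form, not real output.
SAMPLE_SQUEUE_OUTPUT = """\
101|normal|RUNNING|slurm2
102|expedite|RUNNING|slurm2
103|normal|SUSPENDED|slurm3
104|expedite|RUNNING|slurm3
"""


def nodes_with_mixed_qos(squeue_text):
    """Return the sorted nodes where RUNNING jobs belong to more than one QOS."""
    running_qos_by_node = defaultdict(set)
    for line in squeue_text.strip().splitlines():
        job_id, qos, state, node = line.strip().split("|")
        # Suspended, pending, or completing jobs do not count as co-running.
        if state == "RUNNING":
            running_qos_by_node[node].add(qos)
    return sorted(
        node for node, qos_set in running_qos_by_node.items() if len(qos_set) > 1
    )


# slurm2 shows the failure case (both jobs running); slurm3 shows the
# desired case (the 'normal' job suspended while 'expedite' runs).
print(nodes_with_mixed_qos(SAMPLE_SQUEUE_OUTPUT))
```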


Test 4: preempt/partition_prio, SUSPEND,GANG

slurm.conf:
   PreemptType: preempt/partition_prio
   PreemptMode: 'SUSPEND,GANG'
   PreemptExemptTime: '00:03:00'

   PartitionName=active OverSubscribe=FORCE:1 PriorityTier=1 PreemptMode=suspend
   PartitionName=hipri OverSubscribe=FORCE:1 PriorityTier=2 PreemptMode=off

Result: FAIL
User A's job gets preempted by user B's job and is suspended, which is 
desired. However, PreemptExemptTime is not respected: the job is 
preempted immediately.


I see the following possibilities:

a. The man page does *not* accurately describe the behavior, or my 
interpretation is incorrect.
b. I have something misconfigured.
c. I have found a bug.

Cheers,

Phil


