[slurm-users] Fwd: Using PreemptExemptTime

Phil Kauffman philip at kauffman.me
Wed Feb 2 19:12:17 UTC 2022


Does anyone have a working example using PreemptExemptTime?

My goal is to make a higher priority job wait 24 hours before actually 
preempting a lower priority job. Put another way, any job is entitled to 
24 hours of run time before it can be preempted. Ideally, the preempted 
job should be suspended. If requeue is necessary, that is ok.

It's been asked before here: 
https://groups.google.com/g/slurm-users/c/mK4_M4hpXL8/m/sRhT53VYBQAJ

I've run through many iterations attempting to set `PreemptExemptTime` 
in slurm.conf and in QOS.

Setting `PreemptType=preempt/partition_prio`:
- The preempted job gets suspended but `PreemptExemptTime` is ignored.

Setting `PreemptType=preempt/qos`:
- `PreemptExemptTime` is configured in the QOS as well as globally in 
slurm.conf.
- `PreemptExemptTime` is respected, but both jobs continue to run at the 
same time, using 200% of the resources, which is not wanted.


Details from my test cluster are below my signature. Any ideas on what I 
should check or what I am missing? Maybe I misunderstood something.

Cheers,

Phil



In my tests I'm using 3 mins as the PreemptExemptTime.
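
For anyone reproducing this with a different window, here is a minimal Python sketch for turning a duration into a time string for `PreemptExemptTime`. It assumes the `days-hours:minutes:seconds` form is accepted for this option (check the slurm.conf man page for your version); the helper name `slurm_time` is made up for illustration.

```python
def slurm_time(seconds: int) -> str:
    """Format a duration in seconds as [days-]hh:mm:ss.

    Assumption: this form is valid for PreemptExemptTime in your
    Slurm version; verify against the slurm.conf man page.
    """
    if seconds < 0:
        raise ValueError("duration must be non-negative")
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    hms = f"{hours:02d}:{minutes:02d}:{secs:02d}"
    # Only emit the "days-" prefix when there is at least one full day.
    return f"{days}-{hms}" if days else hms


# The 3 minute test value and the 24 hour production goal.
print(f"PreemptExemptTime={slurm_time(3 * 60)}")
print(f"PreemptExemptTime={slurm_time(24 * 3600)}")
```

The first line printed matches the `00:03:00` value used in the configs below; the second is what the 24 hour goal would look like under the same assumption.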

# Nodes
NodeName=slurm[2-5] CPUs=1 Sockets=1 CoresPerSocket=1 ThreadsPerCore=2 RealMemory=1800 MemSpecLimit=200 State=UNKNOWN



### Experiment using PreemptType=preempt/qos
PartitionName=DEFAULT OverSubscribe=FORCE:1 Nodes=slurm[2-4]
PartitionName=active Default=YES QOS=normal
PartitionName=hipri  Default=NO QOS=expedite

PreemptType=preempt/qos
PreemptMode=SUSPEND,GANG
PreemptExemptTime=00:03:00
SchedulerParameters=preempt_strict_order
PriorityType=priority/multifactor
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory
# QOS
[root@slurm2 slurm-llnl]# sacctmgr show qos -p --noheader
normal|1|00:00:00||00:03:00|cluster|||1.000000||||||||||||||||||
expedite|2|00:00:00|normal|00:03:00|cluster|||1.000000||||||||||||||||||
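
Because `--noheader` hides the column names, here is a rough Python sketch that labels the leading fields of the output above. The column order in `FIELDS` is an assumption about the default `sacctmgr show qos` layout and may differ between Slurm versions; rerunning the command without `--noheader` would confirm it. The names `FIELDS` and `parse_qos` are made up for illustration.

```python
# Output copied from the sacctmgr command above.
OUTPUT = """\
normal|1|00:00:00||00:03:00|cluster|||1.000000||||||||||||||||||
expedite|2|00:00:00|normal|00:03:00|cluster|||1.000000||||||||||||||||||
"""

# Assumed order of the first six default columns (not confirmed by the
# output itself, since the header was suppressed).
FIELDS = ["Name", "Priority", "GraceTime", "Preempt",
          "PreemptExemptTime", "PreemptMode"]


def parse_qos(line: str) -> dict:
    """Map the leading pipe-separated values of one QOS row to FIELDS."""
    values = line.strip().split("|")
    return dict(zip(FIELDS, values))


rows = [parse_qos(line) for line in OUTPUT.strip().splitlines()]
for row in rows:
    print(row)
```

Read this way, `expedite` lists `normal` in its Preempt column, both QOS carry a `PreemptExemptTime` of `00:03:00`, and the per-QOS PreemptMode is `cluster`, meaning it falls back to the cluster-wide `PreemptMode`.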





### Experiment using PreemptType=preempt/partition_prio
PartitionName=low    Default=NO OverSubscribe=NO      PriorityTier=10 PreemptMode=requeue
PartitionName=med    Default=NO OverSubscribe=FORCE:1 PriorityTier=20 PreemptMode=suspend
PartitionName=hi     Default=NO OverSubscribe=FORCE:1 PriorityTier=30 PreemptMode=off

PreemptType=preempt/partition_prio
PreemptMode=SUSPEND,GANG
PreemptExemptTime=00:03:00
SchedulerParameters=preempt_strict_order
PriorityType=priority/multifactor
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory



