<div dir="ltr">I would like to add a preemptable queue to our cluster. Actually, I already have. We simply want jobs submitted to that queue to be preempted when there are no resources available for jobs in the other (higher-priority) queues. Conceptually it is very simple: no conditionals, no choices, just what I wrote.<div>However, it does not work as desired.<div><br>This is the relevant part:</div><div><br></div><div><span style="font-size:9pt;font-family:Arial,sans-serif;color:rgb(51,51,51)">grep
-i Preemp /opt/slurm/slurm.conf <br>
#PreemptType = preempt/partition_prio <br>PartitionName=regular DefMemPerCPU=4580 Default=True Nodes=node[01-12]
State=UP PreemptMode=off PriorityTier=200 <br>PartitionName=All DefMemPerCPU=4580 Nodes=node[01-36]
State=UP PreemptMode=off PriorityTier=500 <br>
PartitionName=lowpriority DefMemPerCPU=4580
Nodes=node[01-36] State=UP PreemptMode=cancel
PriorityTier=100 <br>
<br>
<br></span>That PreemptType setting (now commented out) completely breaks Slurm: every command refuses to run, with errors like<span style="font-size:9pt;font-family:Arial,sans-serif;color:rgb(51,51,51)"><br>
<br>
$ squeue <br>
squeue: error: PreemptType and PreemptMode values incompatible <br>
squeue: fatal: Unable to process configuration file <br>
<br>
</span></div>If I understand the documentation at <a href="https://slurm.schedmd.com/preempt.html">https://slurm.schedmd.com/preempt.html</a> correctly, that is because preemption cannot cancel jobs based on partition priority, which (if true) is really unfortunate. I understand that allowing cross-partition time-slicing could be tricky, so I can see why that isn't allowed, but cancelling? Anyway, I have three questions:<div><br></div><div>1) Is that correct, and so should I avoid using either partition priority or cancelling?</div><div>2) Is there an easy way to trick Slurm into requeuing and then have those jobs cancelled instead?</div><div>3) I guess the cleanest option would be to implement QoS, but I've never done it and we don't really need it for anything other than this. The documentation looks complicated, but is it? The great Ole's website is unavailable at the moment...</div></div><div><br></div><div>Thanks!!</div></div>