[slurm-users] preemptable queue

Paul Edmon pedmon at cfa.harvard.edu
Fri Jan 12 14:20:10 UTC 2024


At least in the example you are showing, you have PreemptType commented
out, which means Slurm falls back to the default (preempt/none).
PreemptMode=CANCEL should work with preempt/partition_prio; I don't see
anything in the documentation that indicates it wouldn't. So I suspect
you have a typo somewhere in your conf.
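
For illustration, a minimal sketch of how the preemption-related lines
could look with the plugin enabled. The partition lines are copied from
the quoted message below. The cluster-wide PreemptMode line is an
assumption and does not appear in the original conf: the slurm.conf
documentation says the cluster-level PreemptMode must not be OFF when
any partition enables preemption, so leaving it at its default may be
what triggers the "incompatible" error.

```
# Sketch only. The global PreemptMode line below is an assumption,
# not something taken from the original slurm.conf.
PreemptType=preempt/partition_prio
# Cluster-wide mode; the per-partition PreemptMode values override it.
PreemptMode=CANCEL

PartitionName=regular DefMemPerCPU=4580 Default=True Nodes=node[01-12] State=UP PreemptMode=off PriorityTier=200
PartitionName=All DefMemPerCPU=4580 Nodes=node[01-36] State=UP PreemptMode=off PriorityTier=500
PartitionName=lowpriority DefMemPerCPU=4580 Nodes=node[01-36] State=UP PreemptMode=cancel PriorityTier=100
```

After editing the file and restarting slurmctld, running
"scontrol show config | grep -i preempt" should show the PreemptType
and PreemptMode values that are actually in effect.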

-Paul Edmon-

On 1/11/2024 6:01 PM, Davide DelVento wrote:
> I would like to add a preemptable queue to our cluster. Actually I 
> already have. We simply want jobs submitted to that queue to be preempted 
> if there are no resources available for jobs in other (high priority) 
> queues. Conceptually very simple, no conditionals, no choices, just 
> what I wrote.
> However it does not work as desired.
>
> This is the relevant part:
>
> grep -i Preemp /opt/slurm/slurm.conf
> #PreemptType = preempt/partition_prio
> PartitionName=regular DefMemPerCPU=4580 Default=True Nodes=node[01-12] State=UP PreemptMode=off PriorityTier=200
> PartitionName=All DefMemPerCPU=4580 Nodes=node[01-36] State=UP PreemptMode=off PriorityTier=500
> PartitionName=lowpriority DefMemPerCPU=4580 Nodes=node[01-36] State=UP PreemptMode=cancel PriorityTier=100
>
>
> That PreemptType setting (now commented out) breaks Slurm completely: 
> everything refuses to run, with errors like
>
> $ squeue
> squeue: error: PreemptType and PreemptMode values incompatible
> squeue: fatal: Unable to process configuration file
>
> If I understand correctly the documentation at 
> https://slurm.schedmd.com/preempt.html that is because preemption 
> cannot cancel jobs based on partition priority, which (if true) is 
> really unfortunate. I understand that allowing 
> cross-partition time-slicing could be tricky and so I understand why 
> that isn't allowed, but cancelling? Anyway, I have three questions:
>
> 1) is that correct and so should I avoid using either partition 
> priority or cancelling?
> 2) is there an easy way to trick Slurm into requeueing and then have 
> those jobs cancelled instead?
> 3) I guess the cleanest option would be to implement QoS, but I've 
> never done it and we don't really need it for anything else other than 
> this. The documentation looks complicated, but is it? The great Ole's 
> website is unavailable at the moment...
>
> Thanks!!