[slurm-users] preemptable queue
Davide DelVento
davide.quantum at gmail.com
Thu Jan 11 23:01:48 UTC 2024
I would like to add a preemptable queue to our cluster. Actually I already
have. We simply want jobs submitted to that queue to be preempted if there are
no resources available for jobs in other (higher-priority) queues.
Conceptually very simple, no conditionals, no choices, just what I wrote.
However it does not work as desired.
This is the relevant part:
$ grep -i Preemp /opt/slurm/slurm.conf
#PreemptType = preempt/partition_prio
PartitionName=regular DefMemPerCPU=4580 Default=True Nodes=node[01-12] State=UP PreemptMode=off PriorityTier=200
PartitionName=All DefMemPerCPU=4580 Nodes=node[01-36] State=UP PreemptMode=off PriorityTier=500
PartitionName=lowpriority DefMemPerCPU=4580 Nodes=node[01-36] State=UP PreemptMode=cancel PriorityTier=100
That PreemptType setting (now commented out) breaks Slurm completely: every
command refuses to run, with errors like
$ squeue
squeue: error: PreemptType and PreemptMode values incompatible
squeue: fatal: Unable to process configuration file
If I understand the documentation at https://slurm.schedmd.com/preempt.html
correctly, that is because preemption cannot cancel jobs based on partition
priority, which (if true) is really unfortunate. I understand that allowing
cross-partition time-slicing could be tricky, so I see why that isn't
allowed, but cancelling?
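For reference, the slurm.conf man page describes PreemptMode=OFF as the cluster-wide default and as compatible only with PreemptType=preempt/none. Assuming that is what triggers the error here, a sketch of the configuration with an explicit cluster-wide mode might look like the following. It is untested on our cluster; the global PreemptMode line is the only addition and is an assumption, while the partition lines are copied from the grep output above.

```
# Cluster-wide settings. The PreemptMode line is the assumed addition:
# it sets a default mechanism other than OFF for preempt/partition_prio.
PreemptType=preempt/partition_prio
PreemptMode=CANCEL

# Partition lines unchanged. PreemptMode=off here means jobs in that
# partition are not preempted; lowpriority jobs are cancelled instead.
PartitionName=regular DefMemPerCPU=4580 Default=True Nodes=node[01-12] State=UP PreemptMode=off PriorityTier=200
PartitionName=All DefMemPerCPU=4580 Nodes=node[01-36] State=UP PreemptMode=off PriorityTier=500
PartitionName=lowpriority DefMemPerCPU=4580 Nodes=node[01-36] State=UP PreemptMode=cancel PriorityTier=100
```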
Anyway, I have three questions:
1) is my reading correct, and so should I avoid using either partition
priority or cancelling?
2) is there an easy way to trick Slurm into requeueing and then have those
jobs cancelled instead?
3) I guess the cleanest option would be to implement QoS, but I've never
done it and we don't really need it for anything else other than this. The
documentation looks complicated, but is it? The great Ole's website is
unavailable at the moment...
Thanks!!