Dear All,
I would like to be able to preempt (SUSPEND) a single QoS of a user that blocks the queue for several days. Currently I have about 100 users on the cluster and it seems that setting the "Preempt" option to each QoS (we have personal QoSes) is not optimal.
https://slurm.schedmd.com/sacctmgr.html#OPT_Preempt
Is there a way to set an option on this single problematic QoS, saying that the QoS can be preempted by any other QoS? It would be a much more administrator-friendly solution ;)
Kind regards
Hi Kamil,
I don't use QoS, so I don't have a direct answer to your question. However, I use preemption for a queue/partition, and that is extremely easy to set up and maintain. In case your plan with QoS doesn't work, you can set up a preemptable queue and force this user to submit only to that queue, which might be adequate for your needs.
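A minimal sketch of what such a partition-based setup might look like in slurm.conf. The partition names, node range, and PriorityTier values are hypothetical, and it assumes suspend-style preemption (which, as far as I know, needs GANG alongside SUSPEND when preempt/partition_prio is used). Restricting the user to the preemptable partition is a separate step that is not shown here.

```
# slurm.conf fragment (sketch only; names and values are hypothetical)
PreemptType=preempt/partition_prio
PreemptMode=SUSPEND,GANG

# Jobs in a partition with a higher PriorityTier can preempt jobs
# in a partition with a lower PriorityTier on the same nodes.
PartitionName=main        Nodes=node[01-10] PriorityTier=10 PreemptMode=OFF Default=YES
PartitionName=preemptable Nodes=node[01-10] PriorityTier=1  PreemptMode=SUSPEND
```

Suspended jobs stay in memory, so the nodes need enough RAM for both the suspended and the preempting job.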
Cheers, Davide
On Sun, Mar 30, 2025 at 7:43 AM Kamil Wilczek via slurm-users < slurm-users@lists.schedmd.com> wrote:
Kamil Wilczek [https://keys.openpgp.org/] [D415917E84B8DA5A60E853B6E676ED061316B69B]
Hello Davide,
thank you, this might be a simple and viable solution to this problem. I'll test both solutions (yours and Megan's) and then decide.
Kind regards
Hi Kamil,
It is possible to set the "Preempt" value of all QOSes with two sacctmgr commands. For example, if all the existing QOSes have PreemptMode set to "cluster", which seems likely if this is the first time QOS preemption is being set up, then you can set the PreemptMode of the targeted QOS to "suspend" and then use a second sacctmgr command to add that QOS to the "Preempt" list of all the rest:
$ sacctmgr show qos format=name,preempt,preemptmode
      Name    Preempt PreemptMode
    normal                cluster
      high                cluster
       low                cluster

$ sacctmgr update qos where name=low set PreemptMode="suspend"
 Modified qos...
  low
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y

$ sacctmgr show qos format=name,preempt,preemptmode
      Name    Preempt PreemptMode
    normal                cluster
      high                cluster
       low                suspend

$ sacctmgr update qos where PreemptMode=cluster set Preempt=+low
 Modified qos...
  normal
  high
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y

$ sacctmgr show qos format=name,preempt,preemptmode
      Name    Preempt PreemptMode
    normal        low     cluster
      high        low     cluster
       low                suspend
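If the existing QOSes do not all share one PreemptMode, the "where PreemptMode=cluster" filter will not catch them all. A rough sketch of an alternative is to generate one update command per QOS name. The helper name gen_preempt_cmds is made up for illustration; it only prints the commands (a dry run), uses the same Preempt=+low form as above, and adds -i, which I understand makes sacctmgr commit without the confirmation prompt.

```shell
#!/bin/sh
# Print (do not run) one sacctmgr command per QOS so that every QOS
# except the target is allowed to preempt the target.
# Usage: gen_preempt_cmds TARGET < list-of-qos-names (one per line)
gen_preempt_cmds() {
    target=$1
    while IFS= read -r qos; do
        # Skip blank lines and the target QOS itself.
        [ -n "$qos" ] || continue
        [ "$qos" = "$target" ] && continue
        printf 'sacctmgr -i update qos where name=%s set Preempt=+%s\n' "$qos" "$target"
    done
}

# Example with the three QOS names from the listing above.
printf 'normal\nhigh\nlow\n' | gen_preempt_cmds low
```

On a real cluster the names could presumably come from "sacctmgr -nP show qos format=name" instead of printf; it is worth reviewing the printed commands before piping them to sh.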
Regards, --Megan
Hello Megan,
this looks like a solution, thank you!
The reason I asked for an option that can be set once for only one QoS (that should be preempted by all other QoSes) is that I use Ansible for managing my users, and I have a YAML file with all user data. I was hoping to avoid adding an option to each dict and then updating all QoSes individually, and your solution certainly helps with the latter.
Kind regards