[slurm-users] Is it possible to set a default QOS per partition?

Stack Korora stackkorora at disroot.org
Mon Mar 1 21:24:46 UTC 2021


Greetings,

We have different node classes that we've set up as different 
partitions. For example, our standard compute nodes are in a compute 
partition; our GPUs are in a gpu partition; and jobs that need to run 
for months go to a long partition with a different set of machines.

For each partition, we have a QOS to prevent any single user from 
dominating the resources (each user is capped at a max of 60% of that 
partition's resources; not my call - it's politics - I'm not going 
down that rabbit hole...).
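
Roughly, those caps are implemented as per-user limits on the QOS. A 
minimal sketch of the idea, with made-up numbers (60 CPUs standing in 
for 60% of a hypothetical 100-CPU partition):
$ sacctmgr add qos compute
# cap any single user of this QOS at 60 CPUs
$ sacctmgr modify qos compute set MaxTRESPerUser=cpu=60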

Thus, I've got something like this in my slurm.conf (abbreviating to 
save space; sorry if I trim too much).

PartitionName=compute [snip] AllowQOS=compute Default=YES
PartitionName=gpu [snip] AllowQOS=gpu Default=NO
PartitionName=long [snip] AllowQOS=long Default=NO

Then I have my QOSes configured. And the output of `sacctmgr dump 
cluster | grep DefaultQOS` shows "DefaultQOS=compute".
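
If it helps to see where that default comes from, something like this 
shows the DefaultQOS stored on the associations, including the cluster 
root:
$ sacctmgr show assoc format=Cluster,Account,User,DefaultQOS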

All of that works exactly as expected.

This makes it easy/nice for my users to just do something like:
$ sbatch -n1 -N1 -p compute script.sh

They don't have to specify the QOS for compute and they like this.

However, for the other partitions they have to do something like this:
$ sbatch -n1 -N1 -p long --qos=long script.sh

The users don't like this. (though with scripts, I don't see the big 
deal in just adding a new line...but you know... users...)
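
For what it's worth, by "adding a new line" I just mean putting the 
QOS into the batch script header, e.g. something like:
#!/bin/bash
#SBATCH -n1 -N1
#SBATCH -p long
#SBATCH --qos=long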

The request from the users is to have a default QOS for each 
partition, so they don't need to specify the QOS explicitly for the 
other partitions.

Because the default is set in the cluster configuration, I'm not sure 
how to do this. And I'm not seeing anything in the documentation for a 
scenario like this.

Question A:
Anyone know how I can set a default QOS per partition?

Question B:
Chesterton's fence and all... Is there a better way to accomplish what 
we are attempting to do? I don't want a single QOS whose limits apply 
across all partitions; I need a per-partition limit that restricts 
each user to 60% of the resources in that partition.

Thank you!
~Stack~


