[slurm-users] Guarantee minimum amount of GPU resources to a Slurm account

Stephan Roth stephan.roth at ee.ethz.ch
Tue Sep 12 14:14:28 UTC 2023

Dear Slurm users,

I'm looking to fulfill the requirement of guaranteeing availability of 
GPU resources to a Slurm account, while allowing this account to use 
other available GPU resources as well.

The guaranteed GPU resources should be of at least 1 type, optionally up 
to 3 types, as in:

The version of Slurm I'm using is 20.11.9.

Ideas I came up with so far:

Placing a reservation seems like the simplest solution. But this forces 
users of the account to decide whether to submit their jobs within the 
reservation or outside, based on a manual check of currently available 
GPU resources in the cluster.
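
To illustrate the decision users would have to make by hand, here is a
minimal Python sketch. The helper build_sbatch_args, the reservation name
and the free-GPU count are hypothetical; how the count is obtained (for
example by parsing sinfo output) is left out. Only the sbatch options
--gres and --reservation are real.

```python
def build_sbatch_args(script, gpus_needed, free_gpus_outside, reservation):
    """Return an sbatch argument list (hypothetical helper).

    The reservation is used only when fewer GPUs are free outside it
    than the job needs.
    """
    args = ["sbatch", f"--gres=gpu:{gpus_needed}"]
    if free_gpus_outside < gpus_needed:
        # Not enough free GPUs elsewhere: fall back to the reservation.
        args.append(f"--reservation={reservation}")
    args.append(script)
    return args
```

The sketch only mirrors the manual check; it does not remove the race
between checking availability and submitting the job.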

Changing the partition setup by moving nodes into a new partition for
exclusive use of the account looks like a working solution when combined
with an extension to job_submit.lua that prioritizes partitions for users
of said account. But it is an overhead I'd like to avoid, as this is a
time-limited scenario.

I haven't looked at QOS yet, hoping for a shortcut from anyone who
already has a working solution to my problem.

If you have such a solution, would you mind sharing it?


