[slurm-users] Guarantee minimum amount of GPU resources to a Slurm account
Stephan Roth
stephan.roth at ee.ethz.ch
Tue Sep 12 14:14:28 UTC 2023
Dear Slurm users,
I'm looking to fulfill the requirement of guaranteeing availability of
GPU resources to a Slurm account, while allowing this account to use
other available GPU resources as well.
The guaranteed GPU resources should be of at least 1 type, optionally up
to 3 types, as in:
Gres=gpu:type_1:N,gpu:type_2:P,gpu:type_3:Q
The version of Slurm I'm using is 20.11.9.
Ideas I came up with so far:
Placing a reservation seems like the simplest solution. But this forces
users of the account to decide whether to submit their jobs within the
reservation or outside, based on a manual check of currently available
GPU resources in the cluster.
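To make that manual check concrete, here is a minimal sketch of the decision users would have to make at submit time. Everything in it is an assumption for illustration: the helper names `parse_gres` and `sbatch_args` are hypothetical, the reservation name is made up, and the string describing GPUs currently free outside the reservation would have to be gathered separately (for example from `sinfo` or `scontrol` output, which is not shown here). Only the `--reservation` option of sbatch is taken from Slurm itself.

```python
from typing import Dict, List


def parse_gres(spec: str) -> Dict[str, int]:
    """Parse 'gpu:type_1:4,gpu:type_2:2' into {'type_1': 4, 'type_2': 2}."""
    counts: Dict[str, int] = {}
    for item in spec.split(","):
        name, gpu_type, count = item.strip().split(":")
        if name != "gpu":
            raise ValueError(f"unexpected GRES name: {name}")
        counts[gpu_type] = counts.get(gpu_type, 0) + int(count)
    return counts


def sbatch_args(request: str, free_outside: str, reservation: str) -> List[str]:
    """Return the extra sbatch arguments for a job requesting `request`.

    `free_outside` (hypothetical input) lists the GPUs currently idle
    outside the reservation, in the same gpu:type:count format.
    If those GPUs cover the request, submit normally; otherwise fall
    back to the reservation.
    """
    wanted = parse_gres(request)
    free = parse_gres(free_outside) if free_outside else {}
    fits_outside = all(free.get(t, 0) >= n for t, n in wanted.items())
    return [] if fits_outside else [f"--reservation={reservation}"]
```

A wrapper script around sbatch could call `sbatch_args` and append the result to the command line, but the check is inherently racy: the free GPUs may be taken between the check and the submission, which is part of why this approach is unattractive.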
Changing the partition setup by moving nodes into a new partition for
exclusive use of the account is an overhead I'd like to avoid, as this
is a time-limited scenario.
It does look like a working solution, though, when combined with an
extension to job_submit.lua that prioritizes partitions for users of
said account.
I haven't looked at QOS yet, as I'm hoping for a shortcut from anyone
who already has a working solution to my problem.
If you have such a solution, would you mind sharing it?
Thanks,
Stephan