Hello Slurm users,
I would like to announce slurm-quota, an open-source tool providing time-based CPU and GPU quota accounting and enforcement for Slurm users and accounts.
Slurm does not currently provide native enforcement of consumption-based quotas (e.g. CPU or GPU minutes). slurm-quota addresses this by integrating with Slurm job submission and job completion mechanisms.
Key features include:
- Definition and enforcement of CPU and GPU time quotas at user and account levels
- Support for heterogeneous GPU clusters using billing factors
- Clear command-line reporting of usage and remaining quota, with visual progress bars
- Lightweight design with minimal dependencies (Python, SQLite and a few common Lua libraries)
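To illustrate the billing-factor idea, here is a minimal Python sketch. It is not taken from slurm-quota; the GPU model names, the factor values and the function name are hypothetical, and the real tool may compute charges differently.

```python
# Hypothetical billing factors: relative cost of one minute on each GPU model.
# These names and values are illustrative assumptions, not slurm-quota defaults.
GPU_BILLING_FACTORS = {"a100": 2.0, "v100": 1.0}


def billed_gpu_minutes(gpu_model, gpu_count, elapsed_minutes):
    """Return the GPU minutes charged against a quota for one job."""
    factor = GPU_BILLING_FACTORS[gpu_model]
    return gpu_count * elapsed_minutes * factor
```

With these assumed factors, a 30-minute job on 4 "a100" GPUs is charged twice as much as the same job on 4 "v100" GPUs, so a single quota can cover a mixed cluster.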
slurm-quota is released under the MIT license and is available on GitHub, along with documentation describing architecture, deployment, and integration details.
GitHub Project: https://github.com/rackslab/slurm-quota
Comments, questions, and feedback are very welcome!
-- 
Rémi Palancher
Rackslab: Open Source Solutions for HPC Operations
https://rackslab.io
Rémi Palancher via slurm-users slurm-users@lists.schedmd.com writes:
> Slurm does not currently provide native enforcement of consumption-based quotas (e.g. CPU or GPU minutes).
I don't think that is 100% accurate. Slurm does have the GrpTRESMins specification, which can be set on users, accounts and QoSes. This limits the number of cpu, gpu, memory or billing minutes they can use.
There are limitations, though: one has to use the Priority Multifactor plugin, and cannot use fair share priorities (except if using GrpTRESMins on QoSes only). Also, adjusting the used time is not possible (except resetting it to 0).
That said, I think the slurm-quota tool looks interesting, especially if one uses fair share priorities, or just needs more flexibility.
One question: Does the slurm-quota tool take requeued jobs into account, so that a job that is requeued (either manually or by node failure) will not exceed the quota?
Hello Bjørn-Helge,
On Thursday, January 8, 2026 at 11:23, Bjørn-Helge Mevik via slurm-users slurm-users@lists.schedmd.com wrote:
> Rémi Palancher via slurm-users slurm-users@lists.schedmd.com writes:
>> Slurm does not currently provide native enforcement of consumption-based quotas (e.g. CPU or GPU minutes).
> I don't think that is 100% accurate. Slurm does have the GrpTRESMins specification, which can be set on users, accounts and QoSes. This limits the number of cpu, gpu, memory or billing minutes they can use.
Unfortunately, Slurm evaluates GrpTRESMins on associations against the same usage it computes for fairshare, which decays according to the configured half-life or is cleared at each reset period. The limit therefore loosens as time passes, unless you disable usage decay completely.
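As a rough illustration of why decay loosens such a limit, here is a small Python sketch. It assumes plain exponential decay driven by a half-life (the PriorityDecayHalfLife setting); Slurm applies decay periodically in its own way, so treat the function as a hypothetical approximation rather than Slurm's actual computation.

```python
def decayed_usage(raw_minutes, elapsed_days, half_life_days):
    """Approximate the usage still counted after elapsed_days of decay.

    Assumption: simple exponential decay, halving every half_life_days.
    """
    return raw_minutes * 0.5 ** (elapsed_days / half_life_days)


# An account that consumed 1000 CPU minutes, with an assumed 7-day half-life.
after_one_week = decayed_usage(1000, 7, 7)    # half of the usage is forgotten
after_two_weeks = decayed_usage(1000, 14, 7)  # three quarters are forgotten
```

Under this assumption, an account that hit a 1000-minute limit could submit again a week later without anyone changing the limit, which is not how a fixed consumption quota should behave.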
> One question: Does the slurm-quota tool take requeued jobs into account, so that a job that is requeued (either manually or by node failure) will not exceed the quota?
No, that is indeed a limitation. Slurm calls the completion plugin for requeued jobs, so actual usage is still accounted for correctly. However, Slurm does not call the submission plugin again on requeue (neither the submit nor the modify callback), so with the current design the quota cannot be enforced for requeued jobs.
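The effect can be pictured with a toy Python model. It is only a sketch: the QuotaLedger class, its method names and the numbers are hypothetical, and it assumes the submission check compares the requested time against the remaining quota, which may differ from what slurm-quota really does.

```python
class QuotaLedger:
    """Toy model of a quota checked at submission and charged at completion."""

    def __init__(self, quota_minutes):
        self.quota_minutes = quota_minutes
        self.used_minutes = 0.0

    def check_submit(self, requested_minutes):
        # Submission-time check: accept only if the request fits the quota.
        return self.used_minutes + requested_minutes <= self.quota_minutes

    def record_completion(self, consumed_minutes):
        # Completion-time accounting: runs for every job run, requeued or not.
        self.used_minutes += consumed_minutes


ledger = QuotaLedger(quota_minutes=100)
accepted = ledger.check_submit(80)  # checked once, at submission
ledger.record_completion(60)        # first run stops on node failure, job is requeued
ledger.record_completion(80)        # second run completes; no new submission check
over_quota = ledger.used_minutes > ledger.quota_minutes
```

In this model the job is accepted because 80 minutes fit in the quota, yet the two runs together consume 140 minutes. Usage is recorded accurately, but nothing stops the second run.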
Best regards,
-- 
Rémi Palancher
Rackslab: Open Source Solutions for HPC Operations
https://rackslab.io