[slurm-users] QOS cutting off users before CPU limit is reached
greg.wickham at kaust.edu.sa
Mon May 18 09:19:44 UTC 2020
Something to try:
If you restart “slurmctld” does the new QOS apply?
We had a situation where slurmdbd was running as a different user than slurmctld, so changes made with sacctmgr weren’t being reflected in slurmctld.
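A rough way to check for that mismatch is to compare the owners of the two daemons. This is only a sketch: the helper name daemon_owner is made up for illustration, and it assumes a ps that accepts the -e and -o user=,comm= options (as procps does).

```shell
# Hypothetical helper: print the distinct owners of every process whose
# command name exactly matches the first argument.
daemon_owner() {
    ps -eo user=,comm= | awk -v name="$1" '$2 == name { print $1 }' | sort -u
}

# On the relevant hosts (not run here):
#   daemon_owner slurmctld
#   daemon_owner slurmdbd
```

If slurmdbd runs on a different host from slurmctld, run the helper on each host and compare the results by hand. The SlurmUser settings in slurm.conf and slurmdbd.conf are the other place to look.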
On 27 Apr 2020, at 12:57, Simon Andrews <simon.andrews at babraham.ac.uk> wrote:
I’m trying to use QoS limits to dynamically change the number of CPUs a user is allowed to use on our cluster. As far as I can see I’m setting the appropriate GrpTRES=cpu value and I can read that back, but jobs are being held in the pending state before the user has reached that limit.
In squeue I see loads of lines like:
166599 normal nf-BISMARK_(288) auser PD 0:00 1 (QOSMaxCpuPerUserLimit)
..but if I run:
squeue -t running -p normal --format="%.12u %.2t %C "
Then the total for that user is 288 cores, but in the QoS configuration they should be allowed more. If I run:
sacctmgr show user WithAssoc format=user%12,GrpTRES
..then I get:
What am I missing? Why is ‘auser’ not being allowed to use all 512 of their allowed CPUs before the QOS limit is kicking in?
Thanks for any help you can offer.
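For reference, the per-user running total mentioned above can be added up mechanically rather than by eye. A minimal sketch, assuming the three-column layout (user, state, CPUs) produced by the squeue format string used earlier; the function name sum_cpus_per_user is invented for this example.

```shell
# Hypothetical helper: read squeue output on stdin and print "user total_cpus"
# for each user. Lines whose third column is not a number (such as the
# header line) are skipped.
sum_cpus_per_user() {
    awk '$3 ~ /^[0-9]+$/ { cpus[$1] += $3 }
         END { for (u in cpus) print u, cpus[u] }' | sort
}

# On the cluster (not run here):
#   squeue -t running -p normal --format="%.12u %.2t %C " | sum_cpus_per_user
```

The total this reports for a user is the figure to compare against the limit named in the pending reason.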