[slurm-users] QOS cutting off users before CPU limit is reached

Williams, Jenny Avis jennyw at email.unc.edu
Thu May 14 13:41:22 UTC 2020


Try suspending and resuming the user's pending jobs to force a re-evaluation.

If the user's jobs are not in the window of jobs the scheduler evaluates, i.e. if enough higher-priority jobs have dropped in ahead of them, then a given job may not have been re-evaluated for scheduling since the point in time when the user genuinely was pending for that reason.
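Something along these lines should do it (an untested sketch; 'auser' is a placeholder, and note that scontrol hold/release is the usual way to poke pending jobs, since scontrol suspend only applies to running jobs):

# Hold and immediately release each of the user's pending jobs so the
# scheduler re-evaluates them on its next pass.
for jobid in $(squeue -u auser -t PD -h -o "%i"); do
    scontrol hold "$jobid"
    scontrol release "$jobid"
done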

Jenny

From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Simon Andrews
Sent: Monday, April 27, 2020 5:58 AM
To: slurm-users at lists.schedmd.com
Subject: [slurm-users] QOS cutting off users before CPU limit is reached

I'm trying to use QoS limits to dynamically change the number of CPUs a user is allowed to use on our cluster. As far as I can see I'm setting the appropriate GrpTRES=cpu value, and I can read it back, but jobs are being held back before the user has reached that limit.
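For reference, I'm setting and reading the limit back roughly like this (a sketch of the idea rather than our exact commands; 'auser' and cpu=512 are stand-ins):

# Set the per-user association CPU cap, then read it back.
sacctmgr modify user auser set GrpTRES=cpu=512
sacctmgr show user auser WithAssoc format=user%12,GrpTRES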

In squeue I see loads of lines like:

166599    normal nf-BISMARK_(288)               auser     PD       0:00      1 (QOSMaxCpuPerUserLimit)

...but if I run:

squeue -t running -p normal --format="%.12u %.2t %C "

...then the total for that user comes to 288 cores, even though the QoS configuration should allow them more. If I run:

sacctmgr show user WithAssoc format=user%12,GrpTRES

...then I get:

    auser      cpu=512

What am I missing? Why is 'auser' not being allowed to use all 512 of their permitted CPUs before the QOS limit kicks in?
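In case it's useful, this is how I'd compare the QOS-level per-user cap with the association limit above (a sketch; 'normal' is assumed to be the QOS in play, and the format names come from sacctmgr's QOS options):

# QOSMaxCpuPerUserLimit looks like it refers to the QOS's own MaxTRESPU
# (per-user) cpu cap rather than the association GrpTRES, so show both
# for comparison.
sacctmgr show qos normal format=name%12,MaxTRESPU,GrpTRES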

Thanks for any help you can offer.

Simon.
