[slurm-users] QOS time limit tighter than partition limit
Fulcomer, Samuel
samuel_fulcomer at brown.edu
Thu Dec 16 23:12:41 UTC 2021
I've not parsed your message very far, but...
for i in `cat limit_users` ; do
    sacctmgr modify user where user=$i partition=foo account=bar set grptresrunmins=cpu=Nlimit
done
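
If you want to see what would be run before touching the accounting database, here is a rough dry-run sketch of the same loop. It assumes limit_users holds one username per line. PARTITION, ACCOUNT, CPU_MINUTES and the function name print_limit_cmds are placeholders I made up for illustration, not values from this thread. The -i flag just tells sacctmgr to commit without prompting.

```shell
#!/bin/sh
# Dry run: print one sacctmgr command per user instead of executing it.
# PARTITION, ACCOUNT and CPU_MINUTES are illustrative placeholders.
PARTITION=foo
ACCOUNT=bar
CPU_MINUTES=1440

print_limit_cmds() {
    # $1: file with one username per line
    while IFS= read -r user || [ -n "$user" ]; do
        [ -n "$user" ] || continue   # skip blank lines
        echo "sacctmgr -i modify user where user=$user partition=$PARTITION account=$ACCOUNT set grptresrunmins=cpu=$CPU_MINUTES"
    done < "$1"
}

# Usage: print_limit_cmds limit_users
```

Once the printed commands look right, piping the output to sh would apply them.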
On Thu, Dec 16, 2021 at 6:01 PM Ross Dickson <ross.dickson at ace-net.ca>
wrote:
> I would like to impose a time limit stricter than the partition limit on
> a certain subset of users. I should be able to do this with a QOS, but I
> can't get it to work. What am I missing?
>
> At https://slurm.schedmd.com/resource_limits.html it says,
> "Slurm's hierarchical limits are enforced in the following order ...:
>
> 1. Partition QOS limit
> 2. Job QOS limit
> 3. User association
> 4. Account association(s), ascending the hierarchy
> 5. Root/Cluster association
> 6. Partition limit
> 7. None
>
> Note: If limits are defined at multiple points in this hierarchy, the
> point in this list where the limit is first defined will be used."
>
> And there's a little more later about the Partition limit being an upper
> bound on everything.
>
> This says to me that if:
> * there is a large time limit on a partition,
> * there is a smaller time limit on the job QOS, and
> * the partition has no associated QOS,
> then the MaxWall on the Job QOS should have effect.
>
> But that's not what I observe. I've created a QOS 'nonpaying' with
> MaxWall=1-0:0:0, and set MaxTime=7-0:0:0 on partition 'general'. I set the
> association on user1 so that their job will get QOS 'nonpaying', then
> submit a job with --time=7-0:0:0, and it runs:
>
> $ scontrol show partition general | egrep 'QoS|MaxTime'
> AllocNodes=ALL Default=YES QoS=N/A
> MaxNodes=UNLIMITED MaxTime=7-00:00:00 MinNodes=0 LLN=NO
> MaxCPUsPerNode=UNLIMITED
> $ sacctmgr show qos nonpaying format=name,flags,maxwall
> Name Flags MaxWall
> ---------- -------------------- -----------
> nonpaying 1-00:00:00
> $ scontrol show job 33 | egrep 'QOS|JobState|TimeLimit'
> Priority=4294901728 Nice=0 Account=acad1 QOS=nonpaying
> JobState=RUNNING Reason=None Dependency=(null)
> RunTime=00:00:40 TimeLimit=7-00:00:00 TimeMin=N/A
> $ scontrol show config | grep AccountingStorageEnforce
> AccountingStorageEnforce = associations,limits,qos
>
> Help!?
>
> --
> Ross Dickson, Computational Research Consultant
> ACENET -- Compute Canada -- Dalhousie University
>