[slurm-users] QOS time limit tighter than partition limit

Fulcomer, Samuel samuel_fulcomer at brown.edu
Thu Dec 16 23:15:04 UTC 2021


...and you shouldn't be able to do this with a QoS (at least not in the way
I think you want), since "grptresrunmins" set on a QoS applies to the
aggregate of everything using the QoS.
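
For anyone who wants to review the per-user alternative before touching the
accounting database, here is a rough dry-run sketch of the loop quoted below.
It only prints the sacctmgr commands. The function name, the user-list file,
and the values foo, bar and 1440 are placeholders of mine, and the command
follows the documented "modify user where name=..." form; adjust it to your
site before running anything.

```shell
#!/usr/bin/env bash
# Dry run: print one sacctmgr command per user instead of executing it.
# emit_limit_commands and all of its arguments are hypothetical placeholders.
emit_limit_commands() {
    local users_file=$1 partition=$2 account=$3 cpu_minutes=$4 user
    # The "|| [[ -n $user ]]" part keeps a final line that lacks a newline.
    while IFS= read -r user || [[ -n $user ]]; do
        [[ -z $user ]] && continue   # skip blank lines
        echo "sacctmgr modify user where name=$user partition=$partition account=$account set GrpTRESRunMins=cpu=$cpu_minutes"
    done < "$users_file"
}

# Example with a throwaway user list.
tmp=$(mktemp)
printf 'user1\nuser2\n' > "$tmp"
emit_limit_commands "$tmp" foo bar 1440
rm -f "$tmp"
```

Because the limit is set on each user's own association, it is tracked per
user rather than across everyone sharing a QoS. Once the printed commands
look right they can be pasted into a shell or piped to one.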

On Thu, Dec 16, 2021 at 6:12 PM Fulcomer, Samuel <samuel_fulcomer at brown.edu>
wrote:

> I've not parsed your message very far, but...
>
> for i in `cat limit_users` ; do
>   sacctmgr modify user where user=$i partition=foo account=bar set grptresrunmins=cpu=Nlimit
> done
>
> On Thu, Dec 16, 2021 at 6:01 PM Ross Dickson <ross.dickson at ace-net.ca>
> wrote:
>
>> I would like to impose a time limit stricter than the partition limit on
>> a certain subset of users.  I should be able to do this with a QOS, but I
>> can't get it to work.  What am I missing?
>>
>> At https://slurm.schedmd.com/resource_limits.html it says,
>> "Slurm's hierarchical limits are enforced in the following order ...:
>>
>> 1. Partition QOS limit
>> 2. Job QOS limit
>> 3. User association
>> 4. Account association(s), ascending the hierarchy
>> 5. Root/Cluster association
>> 6. Partition limit
>> 7. None
>>
>> Note: If limits are defined at multiple points in this hierarchy, the
>> point in this list where the limit is first defined will be used."
>>
>> And there's a little more later about the Partition limit being an upper
>> bound on everything.
>>
>> This says to me that if:
>> * there is a large time limit on a partition,
>> * there is a smaller time limit on the job QOS, and
>> * the partition has no associated QOS,
>> then the MaxWall on the Job QOS should have effect.
>>
>> But that's not what I observe.  I've created a QOS 'nonpaying' with
>> MaxWall=1-0:0:0, and set MaxTime=7-0:0:0 on partition 'general'.  I set the
>> association on user1 so that their job will get QOS 'nonpaying', then
>> submit a job with --time=7-0:0:0, and it runs:
>>
>> $ scontrol show partition general | egrep 'QoS|MaxTime'
>>    AllocNodes=ALL Default=YES QoS=N/A
>>    MaxNodes=UNLIMITED MaxTime=7-00:00:00 MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
>> $ sacctmgr show qos nonpaying format=name,flags,maxwall
>>       Name                Flags     MaxWall
>> ---------- -------------------- -----------
>>  nonpaying                       1-00:00:00
>> $ scontrol show job 33 | egrep 'QOS|JobState|TimeLimit'
>>    Priority=4294901728 Nice=0 Account=acad1 QOS=nonpaying
>>    JobState=RUNNING Reason=None Dependency=(null)
>>    RunTime=00:00:40 TimeLimit=7-00:00:00 TimeMin=N/A
>> $ scontrol show config | grep AccountingStorageEnforce
>> AccountingStorageEnforce = associations,limits,qos
>>
>> Help!?
>>
>> --
>> Ross Dickson, Computational Research Consultant
>> ACENET  --   Compute Canada  --  Dalhousie University
>>
>
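
To make the "first defined wins" reading of the quoted documentation
concrete, here is a small sketch that models only that rule. It does not
query Slurm and is not how slurmctld is implemented. The helper names
to_minutes and first_defined are mine, the parser handles only the
[D-]HH:MM:SS form shown in the output above, and the empty association
limits are an assumption, since the thread does not show them.

```shell
#!/usr/bin/env bash
# Illustration of the documented rule only; no Slurm commands are called.

# Convert [D-]HH:MM:SS to whole minutes (seconds are dropped).
to_minutes() {
    local t=$1 days=0 h m s
    if [[ $t == *-* ]]; then
        days=${t%%-*}
        t=${t#*-}
    fi
    IFS=: read -r h m s <<< "$t"
    # 10# forces base 10 so that fields like "08" are not read as octal.
    echo $(( 10#$days * 1440 + 10#$h * 60 + 10#$m ))
}

# Print the first non-empty argument; arguments are given in hierarchy order.
first_defined() {
    local limit
    for limit in "$@"; do
        if [[ -n $limit ]]; then
            echo "$limit"
            return 0
        fi
    done
    return 1
}

# Order: partition QOS, job QOS, user association, account association,
# root association, partition limit. Values are the ones from the thread;
# the empty association slots are assumed.
effective=$(first_defined "" "1-00:00:00" "" "" "" "7-00:00:00")
requested="7-00:00:00"

if (( $(to_minutes "$requested") > $(to_minutes "$effective") )); then
    echo "requested $requested exceeds effective limit $effective"
fi
```

Under that reading the script reports that the seven-day request exceeds the
one-day MaxWall, which is the outcome Ross expected and did not observe.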