[slurm-users] Re: What is the 'Root/Cluster association' level in Resource Limits document mean?
taleintervenor at sjtu.edu.cn
Thu Feb 10 08:42:11 UTC 2022
Well, ‘sacctmgr modify cluster name=***’ is exactly what we wanted, and
inspired by this command, we found that ‘sacctmgr show cluster’
clearly lists all the cluster associations.
But during testing we found another problem. When a limit is defined at both
the cluster level and the user level, the smaller one takes effect; the user
association does not take precedence over the cluster-level one. For example:
> sacctmgr show association format=cluster,account,user,grptres,qos
Cluster    Account    User       GrpTRES       QOS
---------- ---------- ---------- ------------- --------------------
sjtupi     root                  gres/gpu=1    normal
sjtupi     acct-hpc                            normal
sjtupi     acct-hpc   hpczty     gres/gpu=2    normal
The cluster association defines a 1-GPU limit and the user association defines
a 2-GPU limit, yet a job requesting 2 GPUs is blocked:
> scontrol show job 6567880
JobId=6567880 JobName=test
UserId=hpczty(3861) GroupId=hpczty(3861) MCS_label=N/A
Priority=127 Nice=0 Account=acct-hpc QOS=normal
JobState=PENDING Reason=AssocGrpGRES Dependency=(null)
Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
…
NumNodes=1-1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=1,mem=7G,node=1,billing=1,gres/gpu=2
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryCPU=7G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
…
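To check many pending jobs for this kind of block, the relevant fields can be pulled out of the `scontrol show job` text. The following is a minimal sketch in Python, assuming the output is a whitespace-separated list of `Key=Value` tokens as in the excerpt above; `parse_fields` and `parse_tres` are hypothetical helper names, and values that contain spaces (which real output can have) are not handled.

```python
# Excerpt of the job record quoted above (elided lines left out).
SCONTROL_OUTPUT = """\
JobId=6567880 JobName=test
UserId=hpczty(3861) GroupId=hpczty(3861) MCS_label=N/A
Priority=127 Nice=0 Account=acct-hpc QOS=normal
JobState=PENDING Reason=AssocGrpGRES Dependency=(null)
TRES=cpu=1,mem=7G,node=1,billing=1,gres/gpu=2
"""


def parse_fields(text):
    """Split 'Key=Value' tokens into a dict, cutting at the first '='."""
    fields = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that contain no '='
            fields[key] = value
    return fields


def parse_tres(tres):
    """Turn 'cpu=1,mem=7G,...' into {'cpu': '1', 'mem': '7G', ...}."""
    return dict(item.split("=", 1) for item in tres.split(",") if "=" in item)


fields = parse_fields(SCONTROL_OUTPUT)
tres = parse_tres(fields["TRES"])

print(fields["JobState"], fields["Reason"])  # PENDING AssocGrpGRES
print(tres["gres/gpu"])                      # 2
```

Cutting each token at the first `=` only is what keeps the nested `TRES=cpu=1,...` value intact for the second parsing step.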
According to the official documentation at
https://slurm.schedmd.com/resource_limits.html , the user association at
hierarchy level 3 should have higher priority than the cluster association at
hierarchy level 5. Is this a bug, or is the documentation wrong?
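The difference between the expected and the observed behaviour can be written down as two candidate rules. This is only a toy model in Python of the example above, not Slurm's actual implementation; the function names and the `CHAIN` structure are made up for illustration.

```python
# Association chain for user hpczty, most specific level first, with the
# gres/gpu value of GrpTRES from the sacctmgr listing above.
# None means no limit is set at that level.
CHAIN = [
    ("user hpczty", 2),
    ("account acct-hpc", None),
    ("cluster sjtupi (root)", 1),
]


def most_specific_limit(chain):
    """Expected rule: the first level that sets a limit wins."""
    for _level, limit in chain:
        if limit is not None:
            return limit
    return None


def smallest_limit(chain):
    """Observed rule: every level is enforced, so the smallest limit binds."""
    limits = [limit for _level, limit in chain if limit is not None]
    return min(limits) if limits else None


def is_blocked(requested, limit):
    """A request is blocked when it exceeds a limit that is actually set."""
    return limit is not None and requested > limit


requested_gpus = 2  # the job above asks for gres/gpu=2
print(is_blocked(requested_gpus, most_specific_limit(CHAIN)))  # False
print(is_blocked(requested_gpus, smallest_limit(CHAIN)))       # True
```

Under the first rule the 2-GPU job would run; only the second rule reproduces the pending state with Reason=AssocGrpGRES seen in the test.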
From: Paul Brunk <pbrunk at uga.edu>
Sent: February 10, 2022 10:28
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] What is the 'Root/Cluster association' level in
Resource Limits document mean?
Hi:
You can use e.g. 'sacctmgr show -s users', and you'll see each user's
cluster association as one of the output columns. If the name were
'yourcluster', then you could do: sacctmgr modify cluster
name=yourcluster set grpTres="node=8".
==
Paul Brunk, system administrator
Georgia Advanced Resource Computing Center
Enterprise IT Svcs, the University of Georgia
On 2/8/22, 2:33 AM, "slurm-users" <slurm-users-bounces at lists.schedmd.com> wrote:
…[H]ow can I check or modify this “cluster association”? Using the command
sacctmgr show association, I can only list all users’ associations.
Consider the scenario in which we want to set a default node-count
limit for all users. A command such as sacctmgr modify user set
grptres="node=8" can indeed set the limit on all users at once, but it will
overwrite the original per-user limits on some specific accounts. So it may
not be a satisfying solution. If the “cluster association” exists, it may
be exactly what we want. So how do we set the “cluster association”?