[slurm-users] GrpTRESMins and GrpTRESRaw usage

Miguel Oliveira miguel.oliveira at uc.pt
Tue Jun 28 15:23:18 UTC 2022


Hi Gérard,

The way you are checking looks at the association, and that usage is supposed to decay so that fairshare works as intended.
The counter that does not decrease is the one on the QoS, not the association. You can check it with:

scontrol -o show assoc_mgr | grep "^QOS=<account>"

That ought to give you two numbers per TRES: the first is the limit (or N for no limit), and the second, in parentheses, is the usage.
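
For example, for an account/QoS called dci (numbers borrowed from your sshare output below, purely for illustration; the exact field layout depends on the Slurm version), the line contains something like:

scontrol -o show assoc_mgr | grep "^QOS=dci"
... GrpTRESMins=cpu=17150(6932),mem=N(12998963),node=N(216),... ...

Here 17150 is the cpu-minutes limit and 6932 is the accumulated usage, which should not go down once the QoS carries the NoDecay flag.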

Hope that helps.

Best,

Miguel Afonso Oliveira

> On 28 Jun 2022, at 08:58, gerard.gil at cines.fr wrote:
> 
> Hi Miguel,
> 
> 
> I modified my test configuration to evaluate the effect of NoDecay.
> 
> 
> 
> 
> I modified all QOSes, adding the NoDecay flag.
> 
> 
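> For reference, I added the flag with a command along these lines for each QOS
> (exact sacctmgr syntax from memory, so it may need adjusting for your version):
> 
> sacctmgr modify qos where name=normal set Flags+=NoDecay
> 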
> toto at login1:~/TEST$ sacctmgr show QOS
>       Name   Priority  GraceTime    Preempt   PreemptExemptTime PreemptMode                                    Flags UsageThres UsageFactor       GrpTRES   GrpTRESMins GrpTRESRunMin GrpJobs GrpSubmit     GrpWall       MaxTRES MaxTRESPerNode   MaxTRESMins     MaxWall     MaxTRESPU MaxJobsPU MaxSubmitPU     MaxTRESPA MaxJobsPA MaxSubmitPA       MinTRES 
> ---------- ---------- ---------- ---------- ------------------- ----------- ---------------------------------------- ---------- ----------- ------------- ------------- ------------- ------- --------- ----------- ------------- -------------- ------------- ----------- ------------- --------- ----------- ------------- --------- ----------- ------------- 
>     normal          0   00:00:00                                    cluster                                  NoDecay               1.000000                                                                                                                                                                                                                      
> interactif         10   00:00:00                                    cluster                                  NoDecay               1.000000       node=50                                                                 node=22                               1-00:00:00       node=50                                                                         
>      petit          4   00:00:00                                    cluster                                  NoDecay               1.000000     node=1500                                                                 node=22                               1-00:00:00      node=300                                                                         
>       gros          6   00:00:00                                    cluster                                  NoDecay               1.000000     node=2106                                                                node=700                               1-00:00:00      node=700                                                                         
>      court          8   00:00:00                                    cluster                                  NoDecay               1.000000     node=1100                                                                node=100                                 02:00:00      node=300                                                                         
>       long          4   00:00:00                                    cluster                                  NoDecay               1.000000      node=500                                                                node=200                               5-00:00:00      node=200                                                                         
>    special         10   00:00:00                                    cluster                                  NoDecay               1.000000     node=2106                                                               node=2106                               5-00:00:00     node=2106                                                                         
>    support         10   00:00:00                                    cluster                                  NoDecay               1.000000     node=2106                                                                node=700                               1-00:00:00     node=2106                                                                         
>       visu         10   00:00:00                                    cluster                                  NoDecay               1.000000        node=4                                                                node=700                                 06:00:00        node=4                       
> 
> 
> 
> I submitted a bunch of jobs to check that NoDecay takes effect, and I noticed that RawUsage as well as GrpTRESRaw cpu are still decreasing.
> 
> 
> toto at login1:~/TEST$ sshare -A dci -u " " -o account,user,GrpTRESRaw%80,GrpTRESMins,RawUsage
>              Account       User                                                                       GrpTRESRaw                    GrpTRESMins    RawUsage
> -------------------- ----------                            ----------------------------------------------------- ------------------------------ -----------
> dci                                cpu=6932,mem=12998963,energy=0,node=216,billing=6932,fs/disk=0,vmem=0,pages=0                      cpu=17150      415966
> toto at login1:~/TEST$ sshare -A dci -u " " -o account,user,GrpTRESRaw%80,GrpTRESMins,RawUsage
>              Account       User                                                                       GrpTRESRaw                    GrpTRESMins    RawUsage
> -------------------- ----------                            ----------------------------------------------------- ------------------------------ -----------
> dci                                cpu=6931,mem=12995835,energy=0,node=216,billing=6931,fs/disk=0,vmem=0,pages=0                      cpu=17150      415866
> toto at login1:~/TEST$ sshare -A dci -u " " -o account,user,GrpTRESRaw%80,GrpTRESMins,RawUsage
>              Account       User                                                                       GrpTRESRaw                    GrpTRESMins    RawUsage 
> -------------------- ----------                            ----------------------------------------------------- ------------------------------ ----------- 
> dci                                cpu=6929,mem=12992708,energy=0,node=216,billing=6929,fs/disk=0,vmem=0,pages=0                      cpu=17150      415766 
> 
> 
> Is there something I forgot to do?
> 
> 
> Best,
> Gérard
> 
> Cordialement,
> Gérard Gil
> 
> Département Calcul Intensif
> Centre Informatique National de l'Enseignement Superieur
> 950, rue de Saint Priest
> 34097 Montpellier CEDEX 5
> FRANCE
> 
> tel :  (334) 67 14 14 14
> fax : (334) 67 52 37 63
> web : http://www.cines.fr
> 
> From: "Gérard Gil" <gerard.gil at cines.fr>
> To: "Slurm-users" <slurm-users at lists.schedmd.com>
> Cc: "slurm-users" <slurm-users at schedmd.com>
> Sent: Friday, 24 June 2022 14:52:12
> Subject: Re: [slurm-users] GrpTRESMins and GrpTRESRaw usage
> Hi Miguel,
>  
>  Good!
>  
>  I'll try these options on all existing QOSes and see if everything works as
>  expected. I'll let you know the results.
>  
>  
>  Thanks a lot
>  
>  Best,
>  Gérard
>  
>  
>  ----- Original Message -----
> From: "Miguel Oliveira" <miguel.oliveira at uc.pt>
>  To: "Slurm-users" <slurm-users at lists.schedmd.com>
>  Cc: "slurm-users" <slurm-users at schedmd.com>
>  Sent: Friday, 24 June 2022 14:07:16
>  Subject: Re: [slurm-users] GrpTRESMins and GrpTRESRaw usage
> 
>  
> Hi Gérard,
>  
>  I believe so. Each of our accounts corresponds to one project, and each has an
>  associated QoS with NoDecay and DenyOnLimit. This is enough to restrict usage
>  on each individual project.
>  You only need these flags on the QoS; the association carries on as usual and
>  fairshare is not impacted.
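>  
>  In case it helps, the setup on our side looks roughly like this (the QoS and
>  account names are illustrative, and the exact sacctmgr syntax may differ a bit
>  between Slurm versions):
>  
>  sacctmgr add qos proj_qos Flags=NoDecay,DenyOnLimit GrpTRESMins=cpu=1000000
>  sacctmgr modify account proj_acct set QOS+=proj_qos DefaultQOS=proj_qos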
>  
>  Hope that helps,
>  
>  Miguel Oliveira
>  
> On 24 Jun 2022, at 12:56, gerard.gil at cines.fr wrote:
>  
>  Hi Miguel,
>  
> Why not? You can have multiple QoSs and you have other techniques to change
>  priorities according to your policies.
> 
>  
>  Does this answer my question?
>  
>  "If all configured QOSes use NoDecay, can we take advantage of FairShare
>  priority with decay while all jobs' GrpTRESRaw use NoDecay?"
>  
>  Thanks
>  
>  Best,
> 
>  Gérard
> 
