Hello,

We have a new cluster and I'm trying to set up fairshare accounting. I'm trying to track CPU, MEM and GPU. Billing for individual jobs seems to be correct, but it isn't being accumulated (TRESRunMins is always 0).

In my slurm.conf, I think the relevant lines are

AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageTRES=gres/gpu

PriorityFlags=MAX_TRES

PartitionName=gpu Nodes=node[1-7] MaxCPUsPerNode=384 MaxTime=7-0:00:00 State=UP TRESBillingWeights="CPU=1.0,MEM=0.125G,GRES/gpu=9.6"
PartitionName=cpu Nodes=node[1-7] MaxCPUsPerNode=182 MaxTime=7-0:00:00 State=UP TRESBillingWeights="CPU=1.0,MEM=0.125G,GRES/gpu=9.6"

I currently have one recently finished job and one running job.  sacct gives

$ sacct --format=JobID,JobName,ReqTRES%50,AllocTRES%50,TRESUsageInAve%50,TRESUsageInMax%50
JobID           JobName                                            ReqTRES                                          AllocTRES                                     TRESUsageInAve                                     TRESUsageInMax
------------ ---------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- --------------------------------------------------
154          interacti+           billing=9,cpu=1,gres/gpu=1,mem=1G,node=1           billing=9,cpu=2,gres/gpu=1,mem=2G,node=1
154.interac+ interacti+                                                                        cpu=2,gres/gpu=1,mem=2G,node=1 cpu=00:00:00,energy=0,fs/disk=2480503,mem=3M,page+ cpu=00:00:00,energy=0,fs/disk=2480503,mem=3M,page+
155          interacti+           billing=9,cpu=1,gres/gpu=1,mem=1G,node=1           billing=9,cpu=2,gres/gpu=1,mem=2G,node=1
155.interac+ interacti+                                                                        cpu=2,gres/gpu=1,mem=2G,node=1
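
As a sanity check on the billing value in the sacct output above, here is a minimal Python sketch of how I understand PriorityFlags=MAX_TRES to work: the billable value is the largest of the individual weighted TRES on the node, rather than their sum. The weights are the ones from my TRESBillingWeights. The function name and the dict layout are only for illustration, and rounding down to an integer is my assumption about why sacct reports 9 rather than 9.6.

```python
import math

# Weights copied from TRESBillingWeights="CPU=1.0,MEM=0.125G,GRES/gpu=9.6"
# (memory is expressed in GB because the weight carries the G suffix).
WEIGHTS = {"cpu": 1.0, "mem_gb": 0.125, "gres/gpu": 9.6}


def max_tres_billing(alloc, weights=WEIGHTS):
    """Hypothetical helper: largest weighted TRES, as with MAX_TRES."""
    return max(alloc.get(name, 0) * weight for name, weight in weights.items())


# AllocTRES for job 154: cpu=2, gres/gpu=1, mem=2G
alloc = {"cpu": 2, "mem_gb": 2, "gres/gpu": 1}

raw = max_tres_billing(alloc)   # max(2.0, 0.25, 9.6)
billing = math.floor(raw)       # assumption: stored as an integer, rounded down

print(raw, billing)
```

Under these assumptions the GPU term dominates both the requested and the allocated TRES, which matches the billing value sacct reports for both jobs.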

billing=9 seems correct to me, since I have 1 GPU allocated, and with MAX_TRES the GPU has the largest weight, 9.6. However, sshare doesn't show anything in TRESRunMins:

$ sshare --format=Account,User,RawShares,FairShare,RawUsage,EffectvUsage,TRESRunMins%110
Account                    User  RawShares  FairShare    RawUsage  EffectvUsage                                                                                                    TRESRunMins
-------------------- ---------- ---------- ---------- ----------- ------------- --------------------------------------------------------------------------------------------------------------
root                                                     21589714      1.000000         cpu=0,mem=0,energy=0,node=0,billing=0,fs/disk=0,vmem=0,pages=0,gres/gpu=0,gres/gpumem=0,gres/gpuutil=0
 abrol_group                          2000                      0      0.000000         cpu=0,mem=0,energy=0,node=0,billing=0,fs/disk=0,vmem=0,pages=0,gres/gpu=0,gres/gpumem=0,gres/gpuutil=0
 luchko_group                         2000               21589714      1.000000         cpu=0,mem=0,energy=0,node=0,billing=0,fs/disk=0,vmem=0,pages=0,gres/gpu=0,gres/gpumem=0,gres/gpuutil=0
  luchko_group          tluchko          1   0.333333    21589714      1.000000         cpu=0,mem=0,energy=0,node=0,billing=0,fs/disk=0,vmem=0,pages=0,gres/gpu=0,gres/gpumem=0,gres/gpuutil=0

Why is TRESRunMins all 0 for tluchko when RawUsage is not? I have checked that slurmdbd is running.

Thank you,

Tyler