Hi guys,
We've just set up our new cluster and are facing some issues regarding fairshare calculation. Our slurm.conf directives for priority calculation are defined as follows:
PriorityType=priority/multifactor
PriorityFlags=MAX_TRES
PriorityDecayHalfLife=14-0
PriorityFavorSmall=NO
PriorityMaxAge=14-0
PriorityWeightAge=1000
PriorityWeightJobSize=1000
PriorityWeightPartition=10000000
PriorityWeightQOS=10000000
PriorityWeightTRES=CPU=2000,Mem=4000
PriorityWeightFairshare=100000
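In case it helps, this is how we understand the multifactor plugin combines these weights (a rough Python sketch based on our reading of the priority/multifactor documentation; all the factor values below are made-up placeholders, since slurmctld normalizes the real ones to [0, 1] internally):

weights = {
    "age": 1000,
    "job_size": 1000,
    "partition": 10_000_000,
    "qos": 10_000_000,
    "fairshare": 100_000,
}
tres_weights = {"cpu": 2000, "mem": 4000}

# Placeholder factors in [0, 1]; fairshare is what we actually see.
factors = {"age": 0.5, "job_size": 0.1, "partition": 1.0,
           "qos": 0.2, "fairshare": 0.0}
tres_factors = {"cpu": 0.3, "mem": 0.6}

priority = (sum(weights[k] * factors[k] for k in weights)
            + sum(tres_weights[t] * tres_factors[t] for t in tres_factors))
print(int(priority))

With the fairshare factor stuck at 0, the whole PriorityWeightFairshare=100000 term drops out of every job's priority, which is why this matters to us.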
The partition we are submitting our jobs to is set up as follows:
PartitionName=mypart Priority=1000 TRESBillingWeights="CPU=1.0,Mem=0.25G" Default=YES MaxTime=96:0:0 DefMemPerCPU=5333 Nodes=node[001-036] MaxNodes=20
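For what it's worth, our understanding of how TRESBillingWeights interacts with PriorityFlags=MAX_TRES (billing becomes the maximum of the weighted per-node TRES rather than their sum) is sketched below; the job shape is made up:

# Hypothetical job: 8 CPUs with the partition's DefMemPerCPU of 5333 MB.
cpus = 8
mem_gb = cpus * 5333 / 1024

weighted = {
    "cpu": 1.0 * cpus,      # CPU=1.0
    "mem": 0.25 * mem_gb,   # Mem=0.25G
}
billing_max = max(weighted.values())  # PriorityFlags=MAX_TRES, as we read the docs
billing_sum = sum(weighted.values())  # default behaviour without the flag
print(billing_max, billing_sum)       # ~10.4 vs ~18.4 for this job

Note that 0.25 * 5333/1024 is about 1.3 per CPU, so with the default memory per CPU the memory term slightly outweighs the CPU term in the billing.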
Whenever we take a look at the fairshare values using sshare -l, we see the following output:
Account    User       RawShares  NormShares    RawUsage  NormUsage  EffectvUsage  FairShare  LevelFS  GrpTRESMins  TRESRunMins
---------- ---------- ---------- ----------- ----------- ---------- ------------- ---------- -------- ------------ ------------------------------
root                          1    0.000000   268724597   0.000000      0.000000                                   cpu=1098201,mem=5856709132,en+
 root      root               1    0.100000           0   0.000000      0.000000   0.000000                        cpu=0,mem=0,energy=0,node=0,b+
 group1                       1    0.100000           0   0.000000      0.000000                                   cpu=0,mem=0,energy=0,node=0,b+
 group2                       1    0.100000           0   0.000000      0.000000                                   cpu=0,mem=0,energy=0,node=0,b+
 group3                       1    0.100000   268724597   0.000000      0.000000                                   cpu=1098201,mem=5856709132,en+
 group4                       1    0.100000           0   0.000000      0.000000                                   cpu=0,mem=0,energy=0,node=0,b+
 group5                       1    0.100000           0   0.000000      0.000000                                   cpu=0,mem=0,energy=0,node=0,b+
 group6                       1    0.100000           0   0.000000      0.000000                                   cpu=0,mem=0,energy=0,node=0,b+
 group7                       1    0.100000           0   0.000000      0.000000                                   cpu=0,mem=0,energy=0,node=0,b+
 group8                       1    0.100000           0   0.000000      0.000000                                   cpu=0,mem=0,energy=0,node=0,b+
 group9                       1    0.100000           0   0.000000      0.000000                                   cpu=0,mem=0,energy=0,node=0,b+
We find it really weird that the FairShare value is 0 for the root account and "NULL" for all the other groups, even the one with the greatest raw usage.
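As a sanity check (assuming we read the sshare man page correctly, NormUsage should be the association's RawUsage divided by the cluster-wide RawUsage):

# group3 is the only account with usage, so its NormUsage should be ~1.0,
# not the 0.000000 that sshare reports above.
root_raw_usage = 268_724_597
group3_raw_usage = 268_724_597
print(group3_raw_usage / root_raw_usage)  # 1.0 expected

So it looks to us as if the usage normalization itself is coming out as zero everywhere, not just the final FairShare column.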
Taking a look at the data for our users, we see the following:
Account    User       RawShares  NormShares    RawUsage  EffectvUsage  FairShare
---------- ---------- ---------- ----------- ----------- ------------- ----------
root                          1    0.000000   268983721      0.000000
 root      root               1    0.100000           0      0.000000   0.000000
 group3                       1    0.100000   268983721      0.000000
  group3   user1              1    0.090909    12109374      0.000000   0.000000
  group3   user2              1    0.090909           0      0.000000   0.000000
  group3   user3              1    0.090909           0      0.000000   0.000000
  group3   user4              1    0.090909           0      0.000000   0.000000
  group3   user5              1    0.090909           0      0.000000   0.000000
  group3   user6              1    0.090909           0      0.000000   0.000000
  group3   user7              1    0.090909           0      0.000000   0.000000
  group3   user8              1    0.090909   208824597      0.000000   0.000000
  group3   user9              1    0.090909           0      0.000000   0.000000
  group3   user10             1    0.090909           0      0.000000   0.000000
  group3   user11             1    0.090909    48049750      0.000000   0.000000
 group4                       1    0.100000           0      0.000000
  group4   user13             1    0.000000      499452      0.000000   0.000000
 group5                       1    0.100000           0      0.000000
  group5   user14             1    0.000000     1539603      0.000000   0.000000
This behavior seems odd to us, since user1, user8, user11, user13 and user14 are the ones with the highest RawUsage, yet the FairShare value is the same for all of them, including users who have not yet submitted any job.
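For reference, here is our (possibly wrong) mental model of what the default Fair Tree algorithm should produce, sketched with numbers from the output above:

# Fair Tree ranks siblings by LevelFS = NormShares / EffectvUsage (as we
# understand the fair_tree docs); an association with no usage should sort
# ahead of one with heavy usage, so identical FairShare values for, say,
# user2 (no jobs) and user8 (most usage) look wrong to us.
def level_fs(norm_shares: float, effective_usage: float) -> float:
    # Unused associations are treated as "infinitely under-served" here;
    # this special case is our assumption about how zero usage resolves.
    return float("inf") if effective_usage == 0 else norm_shares / effective_usage

print(level_fs(0.090909, 0.0))   # user2: no usage
print(level_fs(0.090909, 0.78))  # user8: ~78% of group3's usage (approx.)

With EffectvUsage reported as 0.000000 for every association, every LevelFS comparison degenerates, which would at least be consistent with the identical FairShare values we see.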
We also noticed that the following error messages appear with some regularity in the slurmctld log:
[2024-03-07T16:38:13.260] error: _append_list_to_array: unable to append NULL list to assoc list.
[2024-03-07T16:38:13.260] error: _calc_tree_fs: unable to calculate fairshare on empty tree
The errors above look like they are coming from: https://github.com/SchedMD/slurm/blob/b11bf689b270f1f5dfe4b0cd54c4fa84b4af31...
Are we missing some setting in slurm.conf? This is kind of strange, because we have another cluster with pretty much the same configuration, and there FairShare is calculated without any problems. Any help would be appreciated.