[slurm-users] Understanding fairshare factor
Ewan Roche
ewan.roche at unil.ch
Fri Jan 14 15:14:01 UTC 2022
Hello Michał,
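The behaviour is what I’d expect from the fair-tree algorithm, which sorts the accounts at each level of the hierarchy and ranks users in the resulting order rather than computing a continuous value. Fair-tree has been the default for the past few Slurm releases.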
Here the algorithm has to decide which of sfglab or faculty (they’re on the same level in the hierarchy) has the higher priority and so goes to the front of the queue. The outcome is more or less 1 or 0 (numerically 0.9 vs 0.2 in your results).
When sfglab is at 0.9, their accumulated usage, taking the decay into account, is less than that of all the faculty users combined; once they have used more than all of faculty, it flips to 0.2.
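On the decay itself, a rough Python sketch may help. It is a simplified model rather than Slurm's code: it assumes usage is multiplied by the same half-life factor in every calculation period and that new usage is then added on top. The function names and the usage figures are made up; the two constants come from your config.

```python
HALF_LIFE_S = 14 * 24 * 3600   # PriorityDecayHalfLife = 14-00:00:00
CALC_PERIOD_S = 5 * 60         # PriorityCalcPeriod = 00:05:00


def decay_factor(elapsed_s, half_life_s=HALF_LIFE_S):
    """Fraction of past usage still counted after elapsed_s seconds."""
    return 0.5 ** (elapsed_s / half_life_s)


def step(usage, new_usage=0.0, elapsed_s=CALC_PERIOD_S):
    """One calculation period: decay the old usage, then add the new usage."""
    return usage * decay_factor(elapsed_s) + new_usage


# Hypothetical decayed usage of two sibling accounts (arbitrary units).
sfglab, faculty = 100.0, 80.0
print(decay_factor(CALC_PERIOD_S))          # per-period factor, just below 1
print(step(sfglab) / step(faculty))         # ratio is unchanged by decay alone
```

Under these assumptions every account is scaled by the same factor each period, so decay on its own does not change who is ahead. Only newly accrued usage can move one sibling past the other.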
If you were looking at the level fair-share then there might be a more gradual change but ultimately it’s a question of, at any one time, who has a higher priority. With only two accounts at this level in competition the result will always look extreme as it flips between who is ahead and who is behind.
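To make the flip concrete, here is a minimal Python sketch of the ranking idea. It is not Slurm's implementation: it assumes equal shares everywhere, a small made-up tree of four users and made-up usage numbers, and `Node`, `rank_users` and `build` are hypothetical names. Siblings are sorted by level fair-share (normalized shares divided by normalized usage), users are collected depth-first in that order, and the factor is the rank divided by the number of users.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    shares: float = 1.0
    usage: float = 0.0                      # decayed usage (set on users)
    children: list = field(default_factory=list)


def total_usage(node):
    """Account usage is the sum of the usage of everything below it."""
    if node.children:
        node.usage = sum(total_usage(c) for c in node.children)
    return node.usage


def level_fairshare(node, siblings):
    """Normalized shares divided by normalized usage among siblings."""
    s = node.shares / sum(x.shares for x in siblings)
    usage_sum = sum(x.usage for x in siblings)
    u = node.usage / usage_sum if usage_sum > 0 else 0.0
    return float("inf") if u == 0 else s / u


def rank_users(root):
    """Depth-first walk, best sibling first; factor = rank / user count."""
    total_usage(root)
    ordered = []

    def visit(node):
        if not node.children:               # a leaf is a user
            ordered.append(node.name)
            return
        best_first = sorted(node.children,
                            key=lambda c: level_fairshare(c, node.children),
                            reverse=True)
        for child in best_first:
            visit(child)

    visit(root)
    n = len(ordered)
    return {name: (n - i) / n for i, name in enumerate(ordered)}


def build(sfglab_usage, faculty_usage):
    """Hypothetical tree: one busy and one idle user on each side."""
    return Node("root", children=[
        Node("sfglab", children=[
            Node("sfglab_user_1", usage=sfglab_usage),
            Node("sfglab_user_2"),
        ]),
        Node("faculty", children=[
            Node("project_1", children=[
                Node("faculty_user_1", usage=faculty_usage),
                Node("faculty_user_2"),
            ]),
        ]),
    ])


print(rank_users(build(59.0, 60.0))["sfglab_user_1"])   # 0.75
print(rank_users(build(61.0, 60.0))["sfglab_user_1"])   # 0.25
```

A change of two units of usage is enough to move sfglab from just behind faculty to just ahead, and every sfglab user's factor jumps with it. With more users in the tree the two plateaus sit further apart, which is the kind of step you see in your graph.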
The original presentation about fair-tree from Ryan Cox and Levi Morrison is well worth reading and is at
https://slurm.schedmd.com/SUG14/fair_tree.pdf
Ewan Roche
Division Calcul et Soutien à la Recherche
UNIL | Université de Lausanne
> On 12 Jan 2022, at 12:01, Michał Kadlof <m.kadlof at mini.pw.edu.pl> wrote:
>
> Hello,
>
> I'm trying to understand the behavior of the fairshare factor. I set up Munin monitoring for several accounts to observe the changes over time, and they're not clear to me.
>
> Some background:
> My users are split into two groups: sfglab and faculty.
> In sfglab everyone is equal, and in faculty they are additionally split into project accounts, in which they are also equal.
>
> for example:
> root
>   sfglab
>     sfglab_user_1
>     sfglab_user_2
>     ...
>   faculty
>     project_1
>       faculty_user_1
>       faculty_user_2
>     project_2
>       faculty_user_1
>       faculty_user_3
>     ...
>
> I have 2 particularly active users who run large jobs in sfglab, while the activity of faculty users is very variable, from very active to dead souls.
>
> Here is the fairshare factor for the last week:
>
>
>
> Image also available on-line:
> https://i.imgur.com/2sfOUFn.png
>
>
> I would expect rather smooth changes instead of those sharp jumps.
> I would also expect to see some small continuous changes from PriorityDecayHalfLife, which is set to two weeks, with priorities recalculated every 5 minutes.
>
> It would be great if someone could comment on this before my users start complaining about the low priority of their jobs.
>
> Here is my Priority config.
>
> PriorityParameters = (null)
> PrioritySiteFactorParameters = (null)
> PrioritySiteFactorPlugin = (null)
> PriorityDecayHalfLife = 14-00:00:00
> PriorityCalcPeriod = 00:05:00
> PriorityFavorSmall = No
> PriorityFlags =
> PriorityMaxAge = 7-00:00:00
> PriorityUsageResetPeriod = NONE
> PriorityType = priority/multifactor
> PriorityWeightAge = 100000
> PriorityWeightAssoc = 0
> PriorityWeightFairShare = 200000
> PriorityWeightJobSize = 0
> PriorityWeightPartition = 0
> PriorityWeightQOS = 0
> PriorityWeightTRES = (null)
>
>
> --
> best regards
> Michał Kadlof