[slurm-users] Advice on setting up fairshare
loris.bennett at fu-berlin.de
Fri Jun 7 06:11:36 UTC 2019
I haven't had time to look into your current problem, but inline I have
some comments about the general approach.
David Baker <D.J.Baker at soton.ac.uk> writes:
> Could someone please give me some advice on setting up the fairshare
> in a cluster. I don't think the present setup is wildly incorrect,
> however either my understanding of the setup is wrong or something is
> When we set a new user up on the cluster and they haven't used any
> resources am I correct in thinking that their fairshare (as reported
> by sshare -a) should be 1.0? Looking at a new user, I see...
> [root@blue52 slurm]# sshare -a | grep rk1n15
> soton rk1n15 1 0.003135 0 0.000000 0.822165
> This is a very simple setup. We have a number of groups (all under
> soton -- general public
> hydrology - specific groups that have purchased their own nodes.
> What I do for each of these groups, when a new user is added, is
> increment the number of shares per the relevant group using, for
> sacctmgr modify account soton set fairshare=X
> Where X is the number of users in the group (soton in this case).
I did this for years and wrote additional logic to automatically
increment/decrement shares when users were added/deleted/moved, but
recently realised that for our use-case it is not necessary.
The way shares seem to be intended to work is that some project gets
a fixed allocation on the system, or some group buys a certain number of
nodes for the cluster. Shares are then dished out based on the size of
the project or number of nodes and are thus fairly static.
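
As a rough illustration of how raw shares turn into the normalised
values that sshare reports, here is a small sketch. It assumes that,
with FAIR_TREE, an association's normalised share is its raw shares
divided by the sum of the raw shares of itself and its siblings. The
totals 363 and 319 do not appear in your output; I have inferred them
from the NormShares values, so treat them as assumptions. The helper
name norm_share is made up for this example.

```shell
#!/bin/sh
# Sketch only: reproduce NormShares values from raw shares.
# $1 = raw shares of one association
# $2 = sum of raw shares of that association and its siblings (assumed)
norm_share() {
    awk -v s="$1" -v t="$2" 'BEGIN { printf "%.6f\n", s / t }'
}

norm_share 3 363   # hydrology among the top-level accounts (363 inferred)
norm_share 1 363   # user root under account root (363 inferred)
norm_share 1 3     # da1g18 among the 3 shares inside hydrology
norm_share 1 319   # rk1n15 inside soton (319 inferred)
```

If the assumption holds, these match the 0.008264, 0.002755, 0.333333
and 0.003135 shown by sshare -a.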
You seem to have more of a setup like we do: a centrally financed system
which is free to use and where everyone is treated equally. What we now
do is have the Fairshare parameter for all accounts in the hierarchy set
to "Parent". This means that everyone ends up with one normalised share
and no changes have to be propagated through the hierarchy.
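
A minimal sketch of how that could be applied, assuming the account
names soton and hydrology from your output. The helper name
set_parent_fairshare and the DRY_RUN variable are made up for
illustration. By default the script only prints the sacctmgr commands,
so they can be checked before anything is changed.

```shell
#!/bin/sh
# Sketch only: set Fairshare=parent on a list of accounts.
# With DRY_RUN=1 (the default here) the commands are printed, not run.
# The account names passed below are assumptions based on the quoted output.
set_parent_fairshare() {
    for acct in "$@"; do
        # -i makes sacctmgr commit the change without asking for confirmation
        cmd="sacctmgr -i modify account $acct set fairshare=parent"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
}

set_parent_fairshare soton hydrology
```

Running it with DRY_RUN=0 would execute the commands instead of printing
them. It is worth checking the result with sshare -a afterwards.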
We also added creation of the Slurm association to the submit plugin, so
that if someone applies for access but never logs in, we can remove
them from the system after four weeks without having to clean up in
Slurm as well.
Maybe this kind of approach might work for you, too.
> The sshare -a command would give me a global overview...
> Account              User       RawShares  NormShares  RawUsage    EffectvUsage  FairShare
> -------------------- ---------- ---------- ----------- ----------- ------------- ----------
> root                                       0.000000    15431286261 1.000000
> root                 root       1          0.002755    40          0.000000      1.000000
> hydrology                       3          0.008264    1357382     0.000088
> hydrology            da1g18     1          0.333333    0           0.000000      0.876289
> Does that all make sense or am I missing something? I am, by the way,
> using the line
> PriorityFlags=ACCRUE_ALWAYS,FAIR_TREE in my slurm.conf.
> Best regards,
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin Email loris.bennett at fu-berlin.de