[slurm-users] changing PriorityDecayHalfLife has no impact on stored accounting data
Paul Edmon
pedmon at cfa.harvard.edu
Tue Oct 16 08:10:52 MDT 2018
I'm not aware of one. This may be worth a feature request to the devs
at bugs.schedmd.com
-Paul Edmon-
On 10/16/18 7:29 AM, Antony Cleave wrote:
> Hi All
>
> Yes, I realise this is almost certainly the intended outcome. I have
> wondered this for a long time but only recently got round to testing
> it on a safe system.
>
> The process is simple:
>   1. run a lot of jobs
>   2. let decay take effect
>   3. change the setting
>   4. restart dbd and ctld
>   5. run another job with debug2 on the ctld
>   6. read the log to see that the QoS still has the same accounting number
>
> [2018-10-15T13:18:16.404] debug2: acct_policy_job_begin: after adding
> job 4304, qos normal grp_used_tres_run_secs(cpu) is 14400
> [2018-10-15T13:47:45.789] debug2: acct_policy_job_begin: after adding
> job 4304, qos normal grp_used_tres_run_secs(cpu) is 14400
>
>
> I wonder if there is a way to have Slurm recalculate the historical
> usage of users/accounts/QoS used for resource limits calculations. It
> has all of the data to do so in the database. I did try cleaning out
> all of the cluster_usage_(month|day|hour)_tables in the accounting db
> after making a backup, as a bit of an experiment, but this just cleans
> the state for everyone, as expected.
>
> For the record, the full undecayed usage is:
> sacct -nP -X -D -q normal --format=CPUTimeRAW -S2018-01-01 | awk -F"|"
> 'BEGIN { sum=0; } { sum += $1; } END { print int(sum/60); }'
> 214676
> cpu minutes showing that it does indeed still have the data required
> to recalculate the usages if we wished to do it.
>
> I know that this would take quite a while to do all the hourly rollups
> but it would be useful to rebalance the system once it was realised
> the decay had been set too fast i.e. left at the default of 1 week.
>
> Also, is there a way for a normal user to see the decayed usage of an
> account/user/QoS? The raw usage is there (as above), but this just
> fuels resentment when your jobs are held back by a limit while someone
> from an account with far more usage (which has decayed away to
> nothing) and the same limit is allowed to run.
>
> Antony
>
>
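For anyone who wants a rough feel for what a different half-life would do to the usage quoted above, here is a minimal sketch. It is not how Slurm computes usage internally: it assumes each job's entire CPU time is charged at the job's end time and then halved once per half-life, whereas Slurm decays usage incrementally while jobs run. The function name decayed_cpu_minutes and the sample records are hypothetical, and the sacct pipeline in the comment assumes sacct honours SLURM_TIME_FORMAT=%s on your system.

```shell
# Sketch only: approximate decayed usage from per-job records.
# stdin: one "end_epoch|cpu_seconds" line per job.
# $1: half-life in seconds, $2: "now" in epoch seconds.
# Assumption: a job's whole usage is charged at its end time.
decayed_cpu_minutes() {
    awk -F'|' -v half="$1" -v now="$2" '
        # Skip records without a numeric end time (e.g. still running jobs)
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
            age = now - $1
            if (age < 0) age = 0
            # Usage halves once per half-life
            sum += $2 * exp(-log(2) * age / half)
        }
        END { printf "%d\n", int(sum / 60 + 0.5) }'
}

# Hypothetical data: two 7200 cpu-second jobs, one ending "now" and one
# ending exactly one week (604800 s) earlier, with a one-week half-life.
printf '%s\n' '1539600000|7200' '1538995200|7200' |
    decayed_cpu_minutes 604800 1539600000

# Possible use against real accounting data (untested assumption):
#   SLURM_TIME_FORMAT=%s sacct -nP -X -D -q normal \
#       --format=End,CPUTimeRAW -S2018-01-01 |
#       decayed_cpu_minutes 604800 "$(date +%s)"
```

With the sample data the older job counts for half of its raw usage, so the result is 180 cpu minutes rather than the undecayed 240.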