[slurm-users] GrpTRESMins and GrpTRESRaw usage
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Thu Jun 23 07:41:34 UTC 2022
Hi Bjørn-Helge,
On 6/23/22 09:18, Bjørn-Helge Mevik wrote:
> <gerard.gil at cines.fr> writes:
>
>> TRESRaw cpu is lower than before, as I'm alone on the system and no other job was submitted.
>> Any explanation for this?
>
> I'd guess you have turned on FairShare priorities. Unfortunately, in
> Slurm the same internal variables are used for fairshare calculations as
> for GrpTRESMins (and similar), so when fair share priorities are in use,
> slurm will reduce accumulated GrpTRESMins over time. This means that it
> is impossible(*) to use GrpTRESMins limits and fairshare
> priorities at the same time.
This is a surprising observation! We use a 14-day HalfLife in slurm.conf:
PriorityDecayHalfLife=14-0
Since our longest-running jobs can run for only 7 days, maybe our limits never
get reduced as you describe?
The slurm.conf man-page says that PriorityDecayHalfLife affects hard time
limits per association:
> PriorityDecayHalfLife
> This controls how long prior resource use is considered in
> determining how over- or under-serviced an association is (user,
> bank account and cluster) in determining job priority. The
> record of usage will be decayed over time, with half of the
> original value cleared at age PriorityDecayHalfLife. If set to
> 0 no decay will be applied. This is helpful if you want to
> enforce hard time limits per association. If set to 0
> PriorityUsageResetPeriod must be set to some interval. Applicable
> only if PriorityType=priority/multifactor. The unit is a time
> string (i.e. min, hr:min:00, days-hr:min:00, or days-hr). The
> default value is 7-0 (7 days).
Is this what explains your statement?
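To get a feeling for the numbers, here is a minimal Python sketch of the half-life decay that the man-page text describes. It assumes an idealized, continuous exponential decay; the function name decayed_usage and the 1000 CPU-minute starting value are made up for illustration, and Slurm's actual bookkeeping may differ in detail.

```python
def decayed_usage(raw_usage: float, elapsed_days: float, half_life_days: float) -> float:
    """Usage remaining after elapsed_days under idealized half-life decay.

    Hypothetical helper for illustration only; not part of Slurm.
    """
    if half_life_days == 0:
        # Corresponds to PriorityDecayHalfLife=0: no decay is applied.
        return raw_usage
    return raw_usage * 0.5 ** (elapsed_days / half_life_days)


if __name__ == "__main__":
    # Assumed example: 1000 CPU-minutes of recorded usage, 14-day half-life.
    for days in (0, 7, 14, 28):
        print(days, round(decayed_usage(1000.0, days, 14.0), 1))
```

Under these assumptions, about 71% of the recorded usage is still counted after 7 days and half of it after 14 days, so even with a 14-day half-life the accumulated value would shrink while no new jobs run.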
BTW, I've written a handy script for displaying user limits in a readable
format:
https://github.com/OleHolmNielsen/Slurm_tools/tree/master/showuserlimits
/Ole