[slurm-users] multifactor priority calculation

Loris Bennett loris.bennett at fu-berlin.de
Tue Jun 14 08:11:53 UTC 2022


z148x at arcor.de writes:

> Thank you.
>
> The output from sprio -lw:
> ---
>   JOBID PARTITION USER PRIORITY SITE AGE ASSOC FAIRSHARE JOBSIZE PARTITION QOS NICE TRES
> Weights                            1   0     0         0  100000         0   0      cpu=1000,mem=2000
> ---
>
> The gpu entry is now removed; I changed this setting.
>
> I checked the db data against the squeue data, and the results are not encouraging.
>
> It seems that the priority changes but is not updated in the db for
> jobs that are already pending.
>
> ---
> job priority for id 299185 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
> job priority for id 299187 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
> job priority for id 299189 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
> <cut>
> KeyError! job id from squeue 299250 not in db
> KeyError! job id from squeue 299251 not in db
> job priority for id 299177 up-to-date 25932 match with 25932 - req_cpus: 1 req_mem: 1024
> job priority for id 299179 up-to-date 25932 match with 25932 - req_cpus: 1 req_mem: 1024
> job priority for id 299181 up-to-date 25932 match with 25932 - req_cpus: 1 req_mem: 1024
> <cut>
> job priority for id 299248 outdated 25932 to 17282 - req_cpus: 1 req_mem: 1024
> job priority for id 299249 outdated 25932 to 17282 - req_cpus: 1 req_mem: 1024
> job priority for id 299252 outdated 25932 to 17282 - req_cpus: 1 req_mem: 1024
> job priority for id 299178 outdated 38499 to 25581 - req_cpus: 14 req_mem: 123904
> job priority for id 299180 outdated 38499 to 25581 - req_cpus: 14 req_mem: 123904
> ---
>
>
> Following job id 299178 as a short example:
> ---
> [06:33:00] job priority for id 299178 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
> [06:33:16] job priority for id 299178 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
> [06:33:28] job priority for id 299178 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
> [06:33:47] job priority for id 299178 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
> [06:34:05] job priority for id 299178 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
> [06:36:10] job priority for id 299178 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
> [06:43:29] job priority for id 299178 outdated 38499 to 25581 - req_cpus: 14 req_mem: 123904
> [06:43:29] job id 299178 pending
> [06:43:49] job priority for id 299178 outdated 38499 to 25581 - req_cpus: 14 req_mem: 123904
> [06:43:49] job id 299178 running
> [06:46:30] job priority for id 299178 outdated 38499 to 25581 - req_cpus: 14 req_mem: 123904
> [06:46:42] job priority for id 299178 outdated 38499 to 25581 - req_cpus: 14 req_mem: 123904
> [06:47:03] job priority for id 299178 outdated 38499 to 25581 - req_cpus: 14 req_mem: 123904
> ---
>
> For other jobs I can see that the priority value is not updated in the db at all.
> Why are the values even changing?
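
A minimal sketch of the kind of check that produces the log lines quoted
above, assuming the db and squeue priorities have already been collected
into two plain dicts keyed by job id.  The function name and the dicts
are hypothetical; how they get filled (pyslurm, sacct, squeue) is left
open, and the squeue priority for 299250 is only a placeholder, since the
thread does not give it.

```python
def compare_priorities(db_priorities, squeue_priorities):
    """Compare priorities stored in the db with the live squeue values.

    Both arguments are dicts mapping job id -> priority (assumed inputs).
    Returns one message per job id seen in squeue, ordered by job id.
    """
    report = []
    for job_id, live in sorted(squeue_priorities.items()):
        if job_id not in db_priorities:
            # Report the missing id instead of raising KeyError.
            report.append(f"job id from squeue {job_id} not in db")
            continue
        stored = db_priorities[job_id]
        if stored == live:
            report.append(
                f"job priority for id {job_id} up-to-date {stored} match with {live}"
            )
        else:
            report.append(
                f"job priority for id {job_id} outdated {stored} to {live}"
            )
    return report


# Priorities for 299178 and 299185 are taken from the quoted log;
# the value for 299250 is a placeholder (not given in the thread).
db = {299178: 38499, 299185: 38499}
squeue = {299178: 25581, 299185: 38499, 299250: 1}

report = compare_priorities(db, squeue)
for line in report:
    print(line)
```

Reporting a missing id rather than raising means that jobs which have
not yet reached the db do not abort the whole comparison.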

I may be stating the obvious, but if a user has running jobs consuming
resources, then the priority of the waiting jobs for that user will
fall.  If not, the priority will stay the same.
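
As a rough illustration of that effect: the multifactor plugin computes
the priority as a weighted sum of factors, each a number between 0.0 and
1.0.  The sketch below is simplified (hypothetical function name, naive
rounding), and the fair-share weight in it is an assumption made purely
for illustration, since the sprio output quoted above shows FAIRSHARE at
0.  The factor values are made up as well.

```python
def multifactor_priority(weights, factors, tres_weights=None,
                         tres_factors=None, site_factor=0, nice=0):
    """Weighted sum in the style of priority/multifactor (simplified).

    weights / factors:           dicts keyed by factor name, e.g. "jobsize"
    tres_weights / tres_factors: dicts keyed by TRES name, e.g. "cpu"
    Factors missing from a dict are treated as 0.0.
    """
    total = site_factor
    for name, weight in weights.items():
        total += weight * factors.get(name, 0.0)
    for name, weight in (tres_weights or {}).items():
        total += weight * (tres_factors or {}).get(name, 0.0)
    return int(total - nice)


# "jobsize" weight as in the thread; "fairshare" weight is hypothetical.
weights = {"jobsize": 100000, "fairshare": 10000}

# Made-up factors: the fair-share factor drops once the user's running
# jobs have consumed resources, everything else stays the same.
before = multifactor_priority(weights, {"jobsize": 0.25, "fairshare": 0.5})
after = multifactor_priority(weights, {"jobsize": 0.25, "fairshare": 0.25})

print(before, after)
```

With these made-up numbers the pending job's priority falls although
its requested cpus and memory are unchanged.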

Cheers,

Loris

> Regards,
> Mike
>
>
>
> On 14.06.22 01:09, Williams, Gareth (IM&T, Black Mountain) wrote:
>> Perhaps run 'sprio -l' and 'sprio -lw' to get more insight into the current priority calculation for pending jobs.
>> 
>> Gareth
>> 
>> -----Original Message-----
>> From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of z148x at arcor.de
>> Sent: Tuesday, 14 June 2022 6:09 AM
>> To: slurm-users at lists.schedmd.com
>> Subject: Re: [slurm-users] multifactor priority calculation
>> 
>> 
>> Hello Lyn,
>> only the priority settings I wrote as example are in the slurm config.
>> 
>> Maybe I found the missing piece.
>> It looks like the priority (for some jobs?) in the Slurm (19.05.5) database is not updated. I retrieve these values via slurmdb through pyslurm.
>> 
>> This would be a problem for my purposes; the priority values from squeue seem to fit.
>> 
>> Is this a bug?
>> 
>> Regards,
>> Mike
>> 
>> On 13.06.22 20:50, Lyn Gerner wrote:
>>> Mike, it feels like there may be other PriorityWeight terms that are 
>>> non-zero in your config. QoS or partition-related, perhaps?
>>>
>>> Regards,
>>> Lyn
>>>
>>> On Mon, Jun 13, 2022 at 5:55 AM <z148x at arcor.de> wrote:
>>>
>>>>
>>>> Dear all,
>>>>
>>>> I noticed that priorities are calculated differently when running a
>>>> pipeline. The settings are, for example:
>>>>
>>>> PriorityType=priority/multifactor
>>>> PriorityWeightJobSize=100000
>>>> AccountingStorageTRES=cpu,mem,gres/gpu
>>>> PriorityWeightTRES=cpu=1000,mem=2000,gres/gpu=3000
>>>>
>>>> No age factor or any other factor from the plugin is set.
>>>>
>>>>
>>>> The calculated priority for memory and cpus:
>>>> mem 1024, cpus 1, priority 25932
>>>> mem 123904, cpus 14, priority 38499
>>>> mem 251904, cpus 28, priority 20652
>>>> mem 251904, cpus 28, priority 14739
>>>>
>>>>
>>>> No gres or gpu was available on the jobs/instances.
>>>>
>>>>
>>>> Does anyone know why the priority changed with the same cpu and mem input?
>>>>
>>>> With these settings the priority should be descending: highest
>>>> priority for mem 251904 with cpus 28 and lowest priority for mem 1024 with cpus 1.
>>>>
>>>>
>>>> Many thanks,
>>>>
>>>> Mike
>>>>
>>>>
>>>
>> 
-- 
Dr. Loris Bennett (Herr/Mr)
ZEDAT, Freie Universität Berlin         Email loris.bennett at fu-berlin.de


