[slurm-users] multifactor priority calculation

z148x at arcor.de z148x at arcor.de
Tue Jun 14 07:32:48 UTC 2022


Thank you.

The output from sprio -lw:
---
JOBID PARTITION     USER   PRIORITY       SITE        AGE      ASSOC  FAIRSHARE    JOBSIZE  PARTITION        QOS        NICE                 TRES
        Weights                              1          0          0          0     100000          0          0                  cpu=1000,mem=2000
---
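
For reference, a minimal sketch of how these weights would combine, assuming the weighted-sum form given in the Slurm priority/multifactor documentation, where each factor is a normalized value between 0.0 and 1.0. The function name and the factor values are made up for illustration; they are not taken from this cluster.

```python
def multifactor_priority(weights, factors, tres_weights, tres_factors,
                         site=0, nice=0):
    """Weighted sum of normalized factors (each assumed to lie in 0.0..1.0)."""
    total = site
    for name, weight in weights.items():
        total += weight * factors.get(name, 0.0)
    for name, weight in tres_weights.items():
        total += weight * tres_factors.get(name, 0.0)
    return total - nice


# Weights as reported by 'sprio -lw' above.
weights = {"age": 0, "assoc": 0, "fairshare": 0,
           "jobsize": 100000, "partition": 0, "qos": 0}
tres_weights = {"cpu": 1000, "mem": 2000}

# Hypothetical factor values, chosen only to show the arithmetic.
factors = {"jobsize": 0.25}
tres_factors = {"cpu": 0.5, "mem": 0.5}

priority = multifactor_priority(weights, factors, tres_weights, tres_factors)
print(priority)  # 0.25*100000 + 0.5*1000 + 0.5*2000 = 26500.0
```

Under this assumption the job size term dominates: it can contribute up to 100000, while cpu and mem together contribute at most 3000.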

The gpu is now removed; I changed that setting.

I checked the db data against the squeue data, and the results are not
encouraging.

It seems that the priority changes, but the value in the db is not
updated for jobs that are already pending.

---
job priority for id 299185 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
job priority for id 299187 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
job priority for id 299189 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
<cut>
KeyError! job id from squeue 299250 not in db
KeyError! job id from squeue 299251 not in db
job priority for id 299177 up-to-date 25932 match with 25932 - req_cpus: 1 req_mem: 1024
job priority for id 299179 up-to-date 25932 match with 25932 - req_cpus: 1 req_mem: 1024
job priority for id 299181 up-to-date 25932 match with 25932 - req_cpus: 1 req_mem: 1024
<cut>
job priority for id 299248 outdated 25932 to 17282 - req_cpus: 1 req_mem: 1024
job priority for id 299249 outdated 25932 to 17282 - req_cpus: 1 req_mem: 1024
job priority for id 299252 outdated 25932 to 17282 - req_cpus: 1 req_mem: 1024
job priority for id 299178 outdated 38499 to 25581 - req_cpus: 14 req_mem: 123904
job priority for id 299180 outdated 38499 to 25581 - req_cpus: 14 req_mem: 123904
---
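
A check of this kind could be sketched roughly as follows. This is not the script that produced the output above. It assumes the live values come from `squeue -h -o "%i %Q"` (job id and integer priority) and that the db values have already been fetched into a dict by some other means, for example via pyslurm. The function names are hypothetical, the priority for job 299250 is invented, and reading "outdated 38499 to 25581" as db value versus squeue value is also an assumption.

```python
def parse_squeue(text):
    """Parse '<jobid> <priority>' lines into {jobid: priority}.

    Job ids are kept as strings so that array ids such as '123_4' survive.
    """
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        job_id, prio = parts
        result[job_id] = int(prio)
    return result


def compare(squeue_prio, db_prio):
    """Classify every job seen by squeue against the db values."""
    report = {}
    for job_id, live in squeue_prio.items():
        if job_id not in db_prio:
            report[job_id] = "missing"      # the KeyError case above
        elif db_prio[job_id] == live:
            report[job_id] = "up-to-date"
        else:
            report[job_id] = "outdated"
    return report


# Sample data: the first two priorities are taken from the output above,
# the value for 299250 is a placeholder.
squeue_text = """299185 38499
299178 25581
299250 17282
"""
db_prio = {"299185": 38499, "299178": 38499}

report = compare(parse_squeue(squeue_text), db_prio)
for job_id, state in report.items():
    print(job_id, state)
```

Reporting a missing job as its own state, rather than letting the lookup raise, keeps the comparison running for jobs that have not reached the db yet.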


Following job id 299178 as a short example:
---
[06:33:00] job priority for id 299178 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
[06:33:16] job priority for id 299178 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
[06:33:28] job priority for id 299178 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
[06:33:47] job priority for id 299178 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
[06:34:05] job priority for id 299178 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
[06:36:10] job priority for id 299178 up-to-date 38499 match with 38499 - req_cpus: 14 req_mem: 123904
[06:43:29] job priority for id 299178 outdated 38499 to 25581 - req_cpus: 14 req_mem: 123904
[06:43:29] job id 299178 pending
[06:43:49] job priority for id 299178 outdated 38499 to 25581 - req_cpus: 14 req_mem: 123904
[06:43:49] job id 299178 running
[06:46:30] job priority for id 299178 outdated 38499 to 25581 - req_cpus: 14 req_mem: 123904
[06:46:42] job priority for id 299178 outdated 38499 to 25581 - req_cpus: 14 req_mem: 123904
[06:47:03] job priority for id 299178 outdated 38499 to 25581 - req_cpus: 14 req_mem: 123904
---

On other jobs I can see that the priority value is not updated in the db at all.
Why are the values even changing?


Regards,
Mike



On 14.06.22 01:09, Williams, Gareth (IM&T, Black Mountain) wrote:
> Perhaps run 'sprio -l' and 'sprio -lw' to get more insight into the current priority calculation for pending jobs.
> 
> Gareth
> 
> -----Original Message-----
> From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of z148x at arcor.de
> Sent: Tuesday, 14 June 2022 6:09 AM
> To: slurm-users at lists.schedmd.com
> Subject: Re: [slurm-users] multifactor priority calculation
> 
> 
> Hello Lyn,
> Only the priority settings I gave as an example are in the Slurm config.
> 
> Maybe I found the missing piece.
> It looks like the priority (for some jobs?) in the Slurm (19.05.5) database is not updated. I retrieve these values via slurmdb over pyslurm.
> 
> This would be a problem for my purposes. The priority values from squeue seem to fit.
> 
> Is this a bug?
> 
> Regards,
> Mike
> 
> On 13.06.22 20:50, Lyn Gerner wrote:
>> Mike, it feels like there may be other PriorityWeight terms that are 
>> non-zero in your config. QoS or partition-related, perhaps?
>>
>> Regards,
>> Lyn
>>
>> On Mon, Jun 13, 2022 at 5:55 AM <z148x at arcor.de> wrote:
>>
>>>
>>> Dear all,
>>>
>>> I noticed different priority calculations while running a pipe. The 
>>> settings are, for example:
>>>
>>> PriorityType=priority/multifactor
>>> PriorityWeightJobSize=100000
>>> AccountingStorageTRES=cpu,mem,gres/gpu
>>> PriorityWeightTRES=cpu=1000,mem=2000,gres/gpu=3000
>>>
>>> No age factor or anything else from the plugin is set.
>>>
>>>
>>> The calculated priority for memory and cpus:
>>> mem 1024, cpus 1, priority 25932
>>> mem 123904, cpus 14, priority 38499
>>> mem 251904, cpus 28, priority 20652
>>> mem 251904, cpus 28, priority 14739
>>>
>>>
>>> gres or gpu was not available on the jobs/instances.
>>>
>>>
>>> Does anyone know why the priority changed with the same cpu and mem input?
>>>
>>> The priority with these settings should be descending, highest 
>>> priority for mem 251904 with cpus 28 and lowest priority for mem 1024 with cpus 1.
>>>
>>>
>>> Many thanks,
>>>
>>> Mike
>>>
>>>
>>
> 


