[slurm-users] [External] Submitting to multiple partitions problem with gres specified
Bas van der Vlies
bas.vandervlies at surf.nl
Tue Mar 9 14:10:26 UTC 2021
For those who are interested:
* https://bugs.schedmd.com/show_bug.cgi?id=11044
On 09/03/2021 14:21, Bas van der Vlies wrote:
> I have found the problem and will submit a patch. If we find a partition
> where the job could run but all of its nodes are busy, save that state and
> return it once all partitions have been checked and the job cannot run in any.
>
> I do not know if this is the right approach.
>
> regards
>
> On 09/03/2021 09:45, Bas van der Vlies wrote:
>> Hi Prentice,
>>
>> Answers inline
>>
>> On 08/03/2021 22:02, Prentice Bisbal wrote:
>>> Rather than specifying the processor types as GRES, I would
>>> recommend defining them as features of the nodes and letting the users
>>> specify the features as constraints on their jobs. Since the newer
>>> processors are backwards compatible with the older processors, list
>>> the older processors as features of the newer nodes, too.
>>>
>> We already do this with features on our other cluster: we assign the nodes
>> different features and users select them. I can also add a feature for the
>> CPU type. Sometimes you want avx512 and a specific processor.
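>>
>> For example (the per-CPU-family feature name 'ivybridge' below is only an
>> illustration, not something we have defined yet):
>> * srun -N1 --constraint="avx512&ivybridge" --pty /bin/bash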
>>
>> On the other cluster we have 5 different GPU types and a lot of partitions. I
>> want to keep it simple for our users, so we have a 'job_submit.lua' script
>> that submits to multiple partitions; if the user specifies the GRES type,
>> Slurm then selects the right partition(s).
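>>
>> A rough sketch of the idea (simplified, not our exact script):
>>
>> ```
>> -- job_submit.lua (sketch): if the user did not pick a partition,
>> -- spread the job over all partitions; the GRES request then limits
>> -- on which partition(s) the job can actually run.
>> function slurm_job_submit(job_desc, part_list, submit_uid)
>>     if job_desc.partition == nil then
>>         local parts = {}
>>         for _, part in pairs(part_list) do
>>             table.insert(parts, part.name)
>>         end
>>         job_desc.partition = table.concat(parts, ",")
>>     end
>>     return slurm.SUCCESS
>> end
>>
>> function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
>>     return slurm.SUCCESS
>> end
>> ```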
>>
>> On this cluster we do not have GPUs, but I can test with another GRES type,
>> 'cpu_type'. I think the last partition in the list determines the
>> behaviour: if I use a GRES that is supported by the last partition,
>> the job gets queued:
>> * srun -N1 --gres=cpu_type:e5_2650_v2 --pty /bin/bash
>> * srun --exclusive --gres=cpu_type:e5_2650_v2 --pty /bin/bash
>> srun: job 1865 queued and waiting for resources
>>
>> So to me it seems that one of the partitions is busy but could run the
>> job. I will test it on our GPU cluster but expect the same behaviour.
>>
>>
>>>
>>> If you want to continue down the road you've already started on, can
>>> you provide more information, like the partition definitions and the
>>> gres definitions? In general, Slurm should support submitting to
>>> multiple partitions.
>>
>> slurm.conf
>> ```
>> PartitionName=cpu_e5_2650_v1 DefMemPerCPU=11000 Default=No DefaultTime=5 DisableRootJobs=YES MaxNodes=2 MaxTime=5-00 Nodes=r16n[18-20] OverSubscribe=EXCLUSIVE QOS=normal State=UP
>>
>> PartitionName=cpu_e5_2650_v2 DefMemPerCPU=11000 Default=No DefaultTime=5 DisableRootJobs=YES MaxNodes=2 MaxTime=5-00 Nodes=r16n[21-22] OverSubscribe=EXCLUSIVE QOS=normal State=UP
>>
>>
>> NodeName=r16n18 CoresPerSocket=8 Features=sandybridge,sse4,avx Gres=cpu_type:e5_2650_v1:no_consume:4T MemSpecLimit=1024 NodeHostname=r16n18.mona.surfsara.nl RealMemory=188000 Sockets=2 State=UNKNOWN ThreadsPerCore=1 Weight=10
>>
>> NodeName=r16n21 CoresPerSocket=8 Features=sandybridge,sse4,avx Gres=cpu_type:e5_2650_v2:no_consume:4T MemSpecLimit=1024 NodeHostname=r16n21.mona.surfsara.nl RealMemory=188000 Sockets=2 State=UNKNOWN ThreadsPerCore=1 Weight=10
>> ```
>>
>> gres.conf
>>
>> ```
>> NodeName=r16n[18-20] Count=4T Flags=CountOnly Name=cpu_type Type=e5_2650_v1
>> NodeName=r16n[21-22] Count=4T Flags=CountOnly Name=cpu_type Type=e5_2650_v2
>> ```
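>>
>> To double-check what the nodes actually report, something like:
>> * scontrol show node r16n18 | grep -i gres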
>>
>>>
>>> Prentice
>>>
>>> On 3/8/21 11:29 AM, Bas van der Vlies wrote:
>>>> Hi,
>>>>
>>>> On this cluster I have version 20.02.6 installed. We have different
>>>> partitions for the CPU and GPU types. We want to make it easy for
>>>> users who do not care where their job runs, while experienced
>>>> users can specify the GRES type: cpu_type or gpu.
>>>>
>>>> I have defined 2 cpu partitions:
>>>> * cpu_e5_2650_v1
>>>> * cpu_e5_2650_v2
>>>>
>>>> and 2 gres cpu_type:
>>>> * e5_2650_v1
>>>> * e5_2650_v2
>>>>
>>>>
>>>> When no partitions are specified it will submit to both partitions:
>>>> * srun --exclusive --gres=cpu_type:e5_2650_v1 --pty /bin/bash -->
>>>> r16n18, which has this GRES defined and is in partition cpu_e5_2650_v1
>>>>
>>>> Now I submit another job at the same time:
>>>> * srun --exclusive --gres=cpu_type:e5_2650_v1 --pty /bin/bash
>>>>
>>>> This fails with: `srun: error: Unable to allocate resources:
>>>> Requested node configuration is not available`
>>>>
>>>> I would expect it gets queued in the partition `cpu_e5_2650_v1`.
>>>>
>>>>
>>>> When I specify the partition on the command line:
>>>> * srun --exclusive -p cpu_e5_2650_v1_shared
>>>> --gres=cpu_type:e5_2650_v1 --pty /bin/bash
>>>>
>>>> srun: job 1856 queued and waiting for resources
>>>>
>>>>
>>>> So the question is: can Slurm handle submitting to multiple
>>>> partitions when we specify GRES attributes?
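>>>>
>>>> I.e., should an explicit comma-separated partition list behave the same,
>>>> for example:
>>>> * srun --exclusive -p cpu_e5_2650_v1,cpu_e5_2650_v2
>>>> --gres=cpu_type:e5_2650_v1 --pty /bin/bash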
>>>>
>>>> Regards
>>>>
>>>>
>>>
>>
>
--
Bas van der Vlies
| HPCV Supercomputing | Internal Services | SURF |
https://userinfo.surfsara.nl |
| Science Park 140 | 1098 XG Amsterdam | Phone: +31208001300 |
| bas.vandervlies at surf.nl