[slurm-users] GPUs not available after making use of all threads?

Mon Feb 13 15:28:49 UTC 2023

Hi Brian and Hermann,

True, this makes a lot of sense. I will consider setting up Hermann's 
configuration, which defaults to "--hint=nomultithread".

Thanks!

Sebastian

On 13.02.23 15:29, Brian Andrus wrote:
> Hermann makes a good point.
>
> The concept of hyper-threading is not doubling cores. It is a single 
> core that can 'instantly' switch work from one process to another. 
> Only one is being worked on at any given time.
>
> So if I request a single core on a hyper-threaded system, I would not 
> be pleased to find you are giving it to someone else 1/2 the time. I 
> would need to have the actual core assigned. If I request multiple 
> cores and my app is only going to affect itself, then I _may_ benefit 
> from hyper-threading.
>
> In general, enabling hyper-threading is not the best practice for 
> efficient HPC jobs. The goal is that every process is utilizing the 
> CPU as close to 100% as possible, which would render hyper-threading 
> moot.
>
> Brian Andrus
>
> On 2/13/2023 12:15 AM, Hermann Schwärzler wrote:
>> Hi Sebastian,
>>
>> I am glad I could help (although not exactly as expected :-).
>>
>> With your node-configuration you are "circumventing" how Slurm 
>> behaves, when using "CR_Core": if you read the respective part in
>>
>> https://slurm.schedmd.com/slurm.conf.html
>>
>> it says:
>>
>> "CR_Core
>>   [...] On nodes with hyper-threads, each thread is counted as a CPU 
>> to satisfy a job's resource requirement, but multiple jobs are not 
>> allocated threads on the same core."
>>
>> That's why you got a full core (both threads) when allocating a single 
>> CPU, or e.g. four threads when allocating three CPUs, and so forth.
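
The rounding described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Slurm's actual allocation code: the function name threads_allocated is made up for this example, and it assumes that whole cores are handed out, as the quoted CR_Core text states.

```python
import math


def threads_allocated(requested_cpus: int, threads_per_core: int = 2) -> int:
    """Threads a job ends up holding when only whole cores are handed out.

    Hypothetical helper for illustration; not part of Slurm.
    """
    # Round the request up to a whole number of cores ...
    cores = math.ceil(requested_cpus / threads_per_core)
    # ... and the job then holds every thread on those cores.
    return cores * threads_per_core


for n in (1, 2, 3, 4):
    print(f"requested {n} CPU(s) -> {threads_allocated(n)} thread(s) allocated")
```

With two threads per core, a request for 1 CPU yields 2 threads and a request for 3 CPUs yields 4, which matches the behaviour reported in this thread.
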
>>
>> "Lying" to Slurm about the actual hardware setup helps to avoid this 
>> behaviour, but are you really comfortable with potentially running two 
>> different jobs on the hyper-threads of the same core?
>>
>> Regards,
>> Hermann
>>
>> On 2/12/23 22:04, Sebastian Schmutzhard-Höfler wrote:
>>> Hi Hermann,
>>>
>>> Using your suggested settings did not work for us.
>>>
>>> When trying to allocate a single thread with --cpus-per-task=1, it 
>>> still reserved a whole core (two threads). On the other hand, when 
>>> requesting an even number of threads, it does what it should.
>>>
>>> However, I could make it work by using
>>>
>>> SelectTypeParameters=CR_Core
>>> NodeName=nodename Sockets=2 CoresPerSocket=128 ThreadsPerCore=1
>>>
>>> instead of
>>>
>>> SelectTypeParameters=CR_Core
>>> NodeName=nodename Sockets=2 CoresPerSocket=64 ThreadsPerCore=2
>>>
>>> So your suggestion pointed me in the right direction. Thanks!
>>>
>>> If anyone thinks this is complete nonsense, please let me know!
>>>
>>> Best wishes,
>>>
>>> Sebastian
>>>
>>> On 11.02.23 11:13, Hermann Schwärzler wrote:
>>>> Hi Sebastian,
>>>>
>>>> we did a similar thing just recently.
>>>>
>>>> We changed our node settings from
>>>>
>>>> NodeName=DEFAULT CPUs=64 Boards=1 SocketsPerBoard=2 
>>>> CoresPerSocket=32 ThreadsPerCore=2
>>>>
>>>> to
>>>>
>>>> NodeName=DEFAULT Boards=1 SocketsPerBoard=2 CoresPerSocket=32 
>>>> ThreadsPerCore=2
>>>>
>>>> in order to make it possible to use individual hyper-threads (we use 
>>>> this in combination with
>>>> SelectTypeParameters=CR_Core_Memory).
>>>>
>>>> This works as expected: after this, when e.g. asking for 
>>>> --cpus-per-task=4 you will get 4 hyper-threads (2 cores) per task 
>>>> (unless you also specify e.g. "--hint=nomultithread").
>>>>
>>>> So you might try to remove the "CPUs=256" part of your 
>>>> node-specification to let Slurm do that calculation of the number 
>>>> of CPUs itself.
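
As far as I understand the slurm.conf documentation, an omitted CPUs value defaults to the product of Boards, Sockets, CoresPerSocket and ThreadsPerCore. The sketch below only reproduces that multiplication for the two node definitions mentioned in this thread; default_cpus is a hypothetical helper name, not a Slurm function.

```python
def default_cpus(boards: int, sockets_per_board: int,
                 cores_per_socket: int, threads_per_core: int) -> int:
    """CPU count derived from the node topology when CPUs= is left out.

    Hypothetical helper for illustration; not part of Slurm.
    """
    return boards * sockets_per_board * cores_per_socket * threads_per_core


# Hermann's nodes: 2 sockets x 32 cores x 2 threads
print(default_cpus(1, 2, 32, 2))  # 128
# Sebastian's node: 2 sockets x 64 cores x 2 threads
print(default_cpus(1, 2, 64, 2))  # 256
```

On Hermann's nodes the old CPUs=64 equalled the number of cores, while the derived value of 128 counts every hyper-thread. On the 2 x 64-core node the product is already 256, so dropping CPUs=256 there would leave the CPU count unchanged.
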
>>>>
>>>>
>>>> BTW, as a side note: since most of our users do not bother to use 
>>>> hyper-threads, or do not want to because their programs might suffer 
>>>> from doing so, we made "--hint=nomultithread" the default in our 
>>>> installation by adding
>>>>
>>>> CliFilterPlugins=cli_filter/lua
>>>>
>>>> to our slurm.conf and creating a cli_filter.lua file in the same 
>>>> directory as slurm.conf, that contains this
>>>>
>>>> function slurm_cli_setup_defaults(options, early_pass)
>>>>         options['hint'] = 'nomultithread'
>>>>
>>>>         return slurm.SUCCESS
>>>> end
>>>>
>>>> (see also 
>>>> https://github.com/SchedMD/slurm/blob/master/etc/cli_filter.lua.example).
>>>> So if users really want to use hyper-threads, they have to add 
>>>> "--hint=multithread" to their job/allocation options.
>>>>
>>>> Regards,
>>>> Hermann
>>>>
>>>> On 2/10/23 00:31, Sebastian Schmutzhard-Höfler wrote:
>>>>> Dear all,
>>>>>
>>>>> we have a node with 2 x 64 cores (with two threads each) and 8 
>>>>> GPUs, running Slurm 22.05.5.
>>>>>
>>>>> In order to make use of individual threads, we changed
>>>>>
>>>>> SelectTypeParameters=CR_Core
>>>>> NodeName=nodename CPUs=256 Sockets=2 CoresPerSocket=64 
>>>>> ThreadsPerCore=2
>>>>>
>>>>> to
>>>>>
>>>>> SelectTypeParameters=CR_CPU
>>>>> NodeName=nodename CPUs=256
>>>>>
>>>>> We are now able to allocate individual threads to jobs, despite 
>>>>> the following error in slurmd.log:
>>>>>
>>>>> error: Node configuration differs from hardware: CPUs=256:256(hw) 
>>>>> Boards=1:1(hw) SocketsPerBoard=256:2(hw) CoresPerSocket=1:64(hw) 
>>>>> ThreadsPerCore=1:2(hw)
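
To see at a glance which fields disagree in the slurmd.log message above, the line can be picked apart with a short Python sketch. It assumes that each field has the form name=configured:detected(hw), which is how the message reads but is not confirmed anywhere in this thread; the function name mismatches is made up for this example.

```python
import re

# The error line from slurmd.log, copied from the message above.
LOG_LINE = (
    "error: Node configuration differs from hardware: CPUs=256:256(hw) "
    "Boards=1:1(hw) SocketsPerBoard=256:2(hw) CoresPerSocket=1:64(hw) "
    "ThreadsPerCore=1:2(hw)"
)


def mismatches(line: str) -> dict:
    """Return {field: (configured, hardware)} for the fields that differ.

    Assumes the 'name=configured:detected(hw)' layout; hypothetical helper.
    """
    pairs = re.findall(r"(\w+)=(\d+):(\d+)\(hw\)", line)
    return {name: (int(conf), int(hw)) for name, conf, hw in pairs if conf != hw}


for field, (conf, hw) in mismatches(LOG_LINE).items():
    print(f"{field}: configured {conf}, hardware {hw}")
```

Under that reading, the total CPU count and the board count agree, while SocketsPerBoard, CoresPerSocket and ThreadsPerCore do not: with only CPUs=256 given, the node is treated as 256 single-core sockets.
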
>>>>>
>>>>>
>>>>> However, it appears that since this change, we can only make use 
>>>>> of 4 out of the 8 GPUs.
>>>>> The output of "sinfo -o %G" might be relevant.
>>>>>
>>>>> In the first situation it was
>>>>>
>>>>> $ sinfo -o %G
>>>>> GRES
>>>>> gpu:A100:8(S:0,1)
>>>>>
>>>>> Now it is:
>>>>>
>>>>> $ sinfo -o %G
>>>>> GRES
>>>>> gpu:A100:8(S:0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78,80,82,84,86,88,90,92,94,96,98,100,102,104,106,108,110,112,114,116,118,120,122,124,126) 
>>>>>
>>>>>
>>>>> Has anyone faced this or a similar issue and can give me some 
>>>>> directions?
>>>>> Best wishes
>>>>>
>>>>> Sebastian
>>>>>
>>>>>
>>>>
>>>
>>
>