[slurm-users] GPUs not available after making use of all threads?

Brian Andrus toomuchit at gmail.com
Mon Feb 13 14:29:59 UTC 2023


Hermann makes a good point.

Hyper-threading does not double the number of cores. It is a single 
core that can 'instantly' switch work from one process to another, but 
only one of them is being worked on at any given time.

So if I request a single core on a hyper-threaded system, I would not be 
pleased to find you are giving it to someone else half the time; I would 
need the actual core assigned to me. If I request multiple cores and the 
only thing competing for them is my own application, then I _may_ 
benefit from hyper-threading.

In general, enabling hyper-threading is not the best practice for 
efficient HPC jobs. The goal is that every process is utilizing the CPU 
as close to 100% as possible, which would render hyper-threading moot.
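
If a job must not share its physical cores with anything else, the
relevant knob on the submission side is the multithreading hint. Just as
a sketch (the script name and core count are placeholders):

#!/bin/bash
#SBATCH --cpus-per-task=4
#SBATCH --hint=nomultithread   # use physical cores only, one thread per core

srun ./my_app

With that hint only one hardware thread per core is used for the job.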

Brian Andrus

On 2/13/2023 12:15 AM, Hermann Schwärzler wrote:
> Hi Sebastian,
>
> I am glad I could help (although not exactly as expected :-).
>
> With your node configuration you are "circumventing" how Slurm 
> behaves when using "CR_Core": if you read the respective part in
>
> https://slurm.schedmd.com/slurm.conf.html
>
> it says:
>
> "CR_Core
>   [...] On nodes with hyper-threads, each thread is counted as a CPU 
> to satisfy a job's resource requirement, but multiple jobs are not 
> allocated threads on the same core."
>
> That's why you got a full core (both threads) when allocating a single 
> CPU, or e.g. four threads when allocating three CPUs, and so forth.
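>
> Just to illustrate (assuming task affinity/cgroups are in place; the 
> exact CPU IDs depend on how they are numbered on your node), you can 
> see that behaviour by checking the cpuset of a one-CPU allocation:
>
> $ srun --cpus-per-task=1 grep Cpus_allowed_list /proc/self/status
>
> With CR_Core this lists both hyper-threads of one core, even though 
> only a single CPU was requested.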
>
> "Lying" to Slurm about the actual hardware-setup helps to avoid this 
> behaviour but are you really confident with potentially running two 
> different jobs on the hyper-threads of the same core?
>
> Regards,
> Hermann
>
> On 2/12/23 22:04, Sebastian Schmutzhard-Höfler wrote:
>> Hi Hermann,
>>
>> Using your suggested settings did not work for us.
>>
>> When trying to allocate a single thread with --cpus-per-task=1, it 
>> still reserved a whole core (two threads). On the other hand, when 
>> requesting an even number of threads, it does what it should.
>>
>> However, I could make it work by using
>>
>> SelectTypeParameters=CR_Core
>> NodeName=nodename Sockets=2 CoresPerSocket=128 ThreadsPerCore=1
>>
>> instead of
>>
>> SelectTypeParameters=CR_Core
>> NodeName=nodename Sockets=2 CoresPerSocket=64 ThreadsPerCore=2
>>
>> So your suggestion brought me in the right direction. Thanks!
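>>
>> For cross-checking such settings, "slurmd -C" run on the node prints 
>> the hardware as Slurm detects it, in slurm.conf syntax, roughly like 
>> this (illustrative, not verbatim output from our machine):
>>
>> NodeName=nodename CPUs=256 Boards=1 SocketsPerBoard=2 CoresPerSocket=64 ThreadsPerCore=2 RealMemory=...
>>
>> which makes it easy to see where a node definition deviates from the 
>> real topology.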
>>
>> If anyone thinks this is complete nonsense, please let me know!
>>
>> Best wishes,
>>
>> Sebastian
>>
>> On 11.02.23 11:13, Hermann Schwärzler wrote:
>>> Hi Sebastian,
>>>
>>> we did a similar thing just recently.
>>>
>>> We changed our node settings from
>>>
>>> NodeName=DEFAULT CPUs=64 Boards=1 SocketsPerBoard=2 
>>> CoresPerSocket=32 ThreadsPerCore=2
>>>
>>> to
>>>
>>> NodeName=DEFAULT Boards=1 SocketsPerBoard=2 CoresPerSocket=32 
>>> ThreadsPerCore=2
>>>
>>> in order to make the use of individual hyper-threads possible (we use 
>>> this in combination with
>>> SelectTypeParameters=CR_Core_Memory).
>>>
>>> This works as expected: after this, when e.g. asking for 
>>> --cpus-per-task=4 you will get 4 hyper-threads (2 cores) per task 
>>> (unless you also specify e.g. "--hint=nomultithread").
>>>
>>> So you might try to remove the "CPUs=256" part of your 
>>> node specification and let Slurm do the calculation of the number of 
>>> CPUs itself.
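>>>
>>> For your node that would be something along these lines (just a 
>>> sketch, using the values from your hardware):
>>>
>>> NodeName=nodename Sockets=2 CoresPerSocket=64 ThreadsPerCore=2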
>>>
>>>
>>> BTW, on a side note: as most of our users do not bother to use 
>>> hyper-threads, or do not even want to because their programs might 
>>> suffer from doing so, we made "--hint=nomultithread" the default in 
>>> our installation by adding
>>>
>>> CliFilterPlugins=cli_filter/lua
>>>
>>> to our slurm.conf and creating a cli_filter.lua file in the same 
>>> directory as slurm.conf that contains this:
>>>
>>> function slurm_cli_setup_defaults(options, early_pass)
>>>         options['hint'] = 'nomultithread'
>>>
>>>         return slurm.SUCCESS
>>> end
>>>
>>> (see also 
>>> https://github.com/SchedMD/slurm/blob/master/etc/cli_filter.lua.example).
>>> So if users really want to use hyper-threads, they have to add 
>>> "--hint=multithread" to their job/allocation options.
>>>
>>> Regards,
>>> Hermann
>>>
>>> On 2/10/23 00:31, Sebastian Schmutzhard-Höfler wrote:
>>>> Dear all,
>>>>
>>>> we have a node with 2 x 64 CPUs (with two threads each) and 8 GPUs, 
>>>> running slurm 22.05.5
>>>>
>>>> In order to make use of individual threads, we changed
>>>>
>>>> SelectTypeParameters=CR_Core
>>>> NodeName=nodename CPUs=256 Sockets=2 CoresPerSocket=64 ThreadsPerCore=2
>>>>
>>>> to
>>>>
>>>> SelectTypeParameters=CR_CPU
>>>> NodeName=nodename CPUs=256
>>>>
>>>> We are now able to allocate individual threads to jobs, despite the 
>>>> following error in slurmd.log:
>>>>
>>>> error: Node configuration differs from hardware: CPUs=256:256(hw) 
>>>> Boards=1:1(hw) SocketsPerBoard=256:2(hw) CoresPerSocket=1:64(hw) 
>>>> ThreadsPerCore=1:2(hw)
>>>>
>>>>
>>>> However, it appears that since this change, we can only make use of 
>>>> 4 out of the 8 GPUs.
>>>> The output of "sinfo -o %G" might be relevant.
>>>>
>>>> In the first situation it was
>>>>
>>>> $ sinfo -o %G
>>>> GRES
>>>> gpu:A100:8(S:0,1)
>>>>
>>>> Now it is:
>>>>
>>>> $ sinfo -o %G
>>>> GRES
>>>> gpu:A100:8(S:0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78,80,82,84,86,88,90,92,94,96,98,100,102,104,106,108,110,112,114,116,118,120,122,124,126) 
>>>>
>>>>
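>>>> In case it helps to interpret the socket list above: with explicit 
>>>> core bindings a gres.conf for such a node would look roughly like 
>>>> this (purely illustrative values, not our actual file):
>>>>
>>>> NodeName=nodename Name=gpu Type=A100 File=/dev/nvidia0 Cores=0-15
>>>> NodeName=nodename Name=gpu Type=A100 File=/dev/nvidia1 Cores=16-31
>>>> (one line per GPU)
>>>>
>>>> Slurm translates those core ranges into the sockets it believes the 
>>>> node has, so a wrong socket count in the node definition presumably 
>>>> also changes how the GPUs appear to be attached.
>>>>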
>>>> Has anyone faced this or a similar issue and can give me some 
>>>> directions?
>>>> Best wishes
>>>>
>>>> Sebastian
>>>>
>>>>
>>>
>>
>


