[slurm-users] GPU to CPU affinity, topo_core_bitmap

christof.koehler at bccms.uni-bremen.de
Sun May 21 13:07:47 UTC 2023

Hello everybody,

I have a problem with my gpu nodes that I believe could be rather
unusual. This is Slurm 23.02.2.

I tried to add new gres entries to the nodes, which did not work at
first. After quite some prodding all gres are now present, but the
GPU-to-CPU affinity (or at least the scontrol gres definition) of one
node now looks odd.

There are four gpu nodes, and one of them (gpu001, on which I did some
trial-and-error experiments) is now different from the others.

$ scontrol show node=gpu001|grep Gres
$ scontrol show node=gpu002|grep Gres

Notice that the socket indicator (S:0-1) is missing from the scontrol
output for gpu001 but present for gpu002.

Running slurmctld with DebugFlags=Gres shows further (abbreviated):
[2023-05-21T14:49:50.503] gres/gpu: state for gpu001
[2023-05-21T14:49:50.503]   gres_cnt found:4 configured:4 avail:4
[2023-05-21T14:49:50.503]   gres_bit_alloc: of 4
[2023-05-21T14:49:50.503]   gres_used:(null)
[2023-05-21T14:49:50.503]   topo[0]:a100(808464737)
[2023-05-21T14:49:50.503]    topo_core_bitmap[0]:NULL

while for the other gpu nodes

[2023-05-21T14:49:50.506] gres/gpu: state for gpu002
[2023-05-21T14:49:50.506]   gres_cnt found:4 configured:4 avail:4
[2023-05-21T14:49:50.506]   gres_bit_alloc: of 4
[2023-05-21T14:49:50.506]   gres_used:(null)
[2023-05-21T14:49:50.506]   topo[0]:a100(808464737)
[2023-05-21T14:49:50.506]    topo_core_bitmap[0]:36-47 of 48

Notice the different topo_core_bitmap field.
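For my own sanity, here is how I understand the mapping from the
"Cores=" entries in gres.conf to the topo_core_bitmap values above.
This is only a minimal sketch of my mental model, not Slurm's actual
implementation; the 48-core total comes from our slurm.conf (2 sockets
x 24 cores):

```python
# Sketch (not Slurm code): each GPU's Cores= range from gres.conf
# becomes a bitmap over the node's cores, 48 here per slurm.conf.

TOTAL_CORES = 48  # Boards=1 * SocketsPerBoard=2 * CoresPerSocket=24

def cores_to_bitmap(spec: str, total: int = TOTAL_CORES) -> list[bool]:
    """Expand a Cores= range like '36-47' into a per-core bitmap."""
    lo, hi = (int(x) for x in spec.split("-"))
    return [lo <= c <= hi for c in range(total)]

def bitmap_to_str(bits: list[bool]) -> str:
    """Render a contiguous bitmap the way the Gres debug log does,
    e.g. '36-47 of 48'."""
    cores = [i for i, b in enumerate(bits) if b]
    return f"{cores[0]}-{cores[-1]} of {len(bits)}"

print(bitmap_to_str(cores_to_bitmap("36-47")))  # "36-47 of 48", as in gpu002's log
```

So on gpu002 the log looks exactly as I would expect, while on gpu001
the bitmap is simply NULL.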

The nodes are stateless (ramdisk based) and have been rebooted;
slurmctld, slurmdbd and the slurmd's have been restarted multiple times,
and scontrol reconfigure has been run multiple times.

I have no clear idea how the nodes ended up in this state, and nothing I
try seems to "synchronize" them again. In particular, changing the
"Cores=" values in gres.conf changes neither node's topo_core_bitmap
field, although I would naively expect it to. It is as if the "Cores="
values were ignored. Note that AutoDetect should not be in play on our
system: Slurm has not been built with NVML support.

Any ideas? Please let me know if further information is needed.

Our gres.conf:

NodeName=gpu[001-004] Name=gpu Type=a100 File=/dev/nvidia0 Cores=0-11
NodeName=gpu[001-004] Name=gpu Type=a100 File=/dev/nvidia1 Cores=12-23
NodeName=gpu[001-004] Name=gpu Type=a100 File=/dev/nvidia2 Cores=24-35
NodeName=gpu[001-004] Name=gpu Type=a100 File=/dev/nvidia3 Cores=36-43
NodeName=gpu[001-004] Name=localdisk Flags=CountOnly
NodeName=gpu[001-004] Name=ramdisk Flags=CountOnly
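As a quick consistency check of these "Cores=" ranges against the 48
cores (2 sockets x 24) defined for the nodes in slurm.conf, I expanded
them by hand (illustrative sketch only):

```python
# Check the Cores= ranges from our gres.conf for gaps or overlaps
# against the 48 cores defined for these nodes in slurm.conf.

TOTAL_CORES = 48
RANGES = ["0-11", "12-23", "24-35", "36-43"]  # from the gres.conf lines above

covered: set[int] = set()
for spec in RANGES:
    lo, hi = (int(x) for x in spec.split("-"))
    cores = set(range(lo, hi + 1))
    assert not covered & cores, f"overlap in {spec}"
    covered |= cores

missing = sorted(set(range(TOTAL_CORES)) - covered)
print("uncovered cores:", missing)  # -> [44, 45, 46, 47]
```

Whether the Cores=36-43 on the /dev/nvidia3 line should in fact be
36-47 (as gpu002's topo_core_bitmap of 36-47 suggests) may be a
separate question from the NULL bitmap on gpu001.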

And extract from slurm.conf:

# grep gpu slurm.conf 

NodeName=gpu[001-004] Boards=1 SocketsPerBoard=2 CoresPerSocket=24
ThreadsPerCore=1 RealMemory=500000 Feature=cpu6342,ram512,gpuA100

PartitionName=gpu Nodes=gpu[001-004] MaxTime=72:00:00 OverTimeLimit=5
DefMemPerCPU=4096 MaxMemPerCPU=7500 State=UP

Best Regards


Dr. rer. nat. Christof Köhler       email: c.koehler at uni-bremen.de
Universitaet Bremen/FB1/BCCMS       phone:  +49-(0)421-218-62334
Am Fallturm 1/ TAB/ Raum 3.06       fax: +49-(0)421-218-62770
28359 Bremen  
