Hello,
I have compiled Slurm 24.11.3 and configured two GPUs on my system (slurmctld and slurmd run on the same computer). The machine has an old Intel i7 processor with 4 cores and Hyper-Threading (2 threads per core). The node is configured in slurm.conf as follows:
NodeName=mysystem CPUs=8 Boards=1 SocketsPerBoard=1 CoresPerSocket=4 ThreadsPerCore=2 RealMemory=7940 Gres=gpu:geforce_gtx_titan_x:1,gpu:geforce_gtx_titan_black:1
The “lscpu” command returns:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
BIOS Vendor ID: Intel(R) Corporation
CPU family: 6
Model: 26
Model name: Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz
BIOS Model name: Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz
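As a quick sanity check, the topology in the node definition can be compared with the lscpu output. The following is only a minimal Python sketch, not a Slurm tool: the helper names parse_lscpu and parse_node_line are made up for illustration, and the input strings are copied from this post.

```python
# Illustrative sanity check (not part of Slurm): compare the topology
# declared on the slurm.conf node line with what lscpu reports.
LSCPU_TEXT = """\
CPU(s): 8
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
"""

NODE_LINE = ("NodeName=mysystem CPUs=8 Boards=1 SocketsPerBoard=1 "
             "CoresPerSocket=4 ThreadsPerCore=2 RealMemory=7940")


def parse_lscpu(text):
    """Return a dict of the 'key: value' pairs in lscpu-style output."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def parse_node_line(line):
    """Return a dict of the Key=Value tokens on a slurm.conf node line."""
    return dict(token.split("=", 1) for token in line.split())


hw = parse_lscpu(LSCPU_TEXT)
node = parse_node_line(NODE_LINE)

# Logical CPUs = sockets x cores per socket x threads per core.
hw_cpus = (int(hw["Socket(s)"]) * int(hw["Core(s) per socket"])
           * int(hw["Thread(s) per core"]))
conf_cpus = (int(node["Boards"]) * int(node["SocketsPerBoard"])
             * int(node["CoresPerSocket"]) * int(node["ThreadsPerCore"]))

print(hw_cpus, conf_cpus, int(node["CPUs"]))
```

All three numbers agree (1 socket x 4 cores x 2 threads), so the node line itself appears consistent with the hardware.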
The gres.conf file is:
NodeName=mysystem Autodetect=off Name=gpu Type=geforce_gtx_titan_x File=/dev/nvidia0 CPUs=0-1
NodeName=mysystem Autodetect=off Name=gpu Type=geforce_gtx_titan_black File=/dev/nvidia1 CPUs=2-3
However, when I start the slurmctld daemon, it logs this error:
[2025-04-28T09:35:41.003] error: _check_core_range_matches_sock: gres/gpu GRES core specification 0-1 for node aopcvis5 doesn't match socket boundaries. (Socket 0 is cores 0-3)
[2025-04-28T09:35:41.003] error: Setting node aopcvis5 state to INVAL with reason:gres/gpu GRES core specification 0-1 for node aopcvis5 doesn't match socket boundaries. (Socket 0 is cores 0-3)
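Taken literally, the message suggests that the core range given for a GPU is expected to start and end on socket boundaries (here, socket 0 spans cores 0-3). The Python sketch below models only that reading. It is an assumption based on the log text, not Slurm's actual _check_core_range_matches_sock code, and the function names are hypothetical.

```python
# Illustrative model of the check implied by the error message; this is NOT
# Slurm's actual _check_core_range_matches_sock implementation.

def socket_ranges(sockets, cores_per_socket):
    """Return the inclusive (first, last) core index of each socket."""
    return [(s * cores_per_socket, (s + 1) * cores_per_socket - 1)
            for s in range(sockets)]


def matches_socket_boundaries(first, last, sockets, cores_per_socket):
    """True if cores first..last start and end exactly on socket boundaries."""
    ranges = socket_ranges(sockets, cores_per_socket)
    starts = {lo for lo, _ in ranges}
    ends = {hi for _, hi in ranges}
    return first <= last and first in starts and last in ends


# One socket with four cores, as in the node definition in this post.
for first, last in [(0, 1), (2, 3), (0, 3)]:
    ok = matches_socket_boundaries(first, last, sockets=1, cores_per_socket=4)
    status = "ok" if ok else "does not match socket boundaries"
    print(f"cores {first}-{last}: {status}")
```

Under this reading, the ranges 0-1 and 2-3 from gres.conf would both be rejected on a single-socket node, while 0-3 would pass. Whether that is really what Slurm 24.11 requires is part of the question.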
Where is my configuration error?
Thanks.