[slurm-users] Understanding gres binding
Kilian Cavalotti
kilian.cavalotti.work at gmail.com
Thu May 10 11:37:25 MDT 2018
Hi Paul,
I'd first suggest upgrading to 17.11.6. I think the first couple of
17.11.x releases had some issues with GRES binding.
Then, I believe you also need to request all of your cores to be
allocated on the same socket, if that's what you want. Something like
--ntasks-per-socket=16.
Here's what I have on a dual-socket, 20-core machine, with interleaved
CPU ids (hi Dell!):
$ srun -n10 --ntasks-per-socket=10 -p test --pty bash
$ lscpu | grep NUMA
NUMA node(s): 2
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19
$ cat /sys/fs/cgroup/cpuset/$(awk -F: '/cpuset/ {print $3}' /proc/$$/cgroup)/cpuset.cpus
1,3,5,7,9,11,13,15,17,19
Without the --ntasks-per-socket option, the allocated CPUs span both
sockets (ids 0-9 mix even and odd ids):
$ srun -n10 -p test --pty bash
$ cat /sys/fs/cgroup/cpuset/$(awk -F: '/cpuset/ {print $3}' /proc/$$/cgroup)/cpuset.cpus
0-9
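To check which sockets a cpuset touches without comparing the ids by eye,
something like the following could work. It is only a rough sketch, assuming
a POSIX shell with seq available; expand_cpulist and nodes_spanned are
hypothetical helper names made up for this example, not Slurm tools. The
helpers use plain global shell variables to stay portable.

```shell
# Expand a kernel-style CPU list ("0-3,8,10-11") into space-separated ids.
expand_cpulist() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r lo hi; do
        # A single id such as "8" leaves hi empty, so fall back to lo.
        seq "$lo" "${hi:-$lo}"
    done | tr '\n' ' ' | sed 's/ $//'
}

# Print the indices of the NUMA nodes that a CPU list touches.
# Usage: nodes_spanned CPUSET NODE0_CPUS [NODE1_CPUS ...]
nodes_spanned() {
    # Pad with spaces so " 1 " cannot match inside " 11 ".
    cpuset=" $(expand_cpulist "$1") "
    shift
    node=0
    spanned=""
    for nodelist in "$@"; do
        for cpu in $(expand_cpulist "$nodelist"); do
            case "$cpuset" in
                *" $cpu "*) spanned="${spanned:+$spanned }$node"; break ;;
            esac
        done
        node=$((node + 1))
    done
    echo "$spanned"
}
```

With the values from the session above:

$ nodes_spanned 1,3,5,7,9,11,13,15,17,19 0,2,4,6,8,10,12,14,16,18 1,3,5,7,9,11,13,15,17,19
1
$ nodes_spanned 0-9 0,2,4,6,8,10,12,14,16,18 1,3,5,7,9,11,13,15,17,19
0 1

On a live node the per-node lists could presumably be read from
/sys/devices/system/node/node*/cpulist rather than typed in by hand.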
HTH.
Cheers,
--
Kilian