[slurm-users] Understanding gres binding

Kilian Cavalotti kilian.cavalotti.work at gmail.com
Thu May 10 11:37:25 MDT 2018


Hi Paul,

I'd first suggest upgrading to 17.11.6; I think the first couple of
17.11.x releases had some issues with GRES binding.

Then, I believe you also need to request all of your cores to be
allocated on the same socket, if that's what you want. Something like
--ntasks-per-socket=16.

Here's what I have on a dual-socket, 20-core machine, with interleaved
CPU ids (hi Dell!):

$ srun -n10 --ntasks-per-socket=10 -p test --pty bash
$ lscpu | grep NUMA
NUMA node(s):          2
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18
NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19
$ cat /sys/fs/cgroup/cpuset/$(awk -F: '/cpuset/ {print $3}' /proc/$$/cgroup)/cpuset.cpus
1,3,5,7,9,11,13,15,17,19

Without the --ntasks-per-socket option:

$ srun -n10 -p test --pty bash
$ cat /sys/fs/cgroup/cpuset/$(awk -F: '/cpuset/ {print $3}' /proc/$$/cgroup)/cpuset.cpus
0-9
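
If you'd rather not eyeball the CPU lists, a rough sketch along these
lines could tell you whether an allocation fits within a single NUMA
node. It assumes the standard Linux sysfs layout
(/sys/devices/system/node/node<N>/cpulist) and plain "a-b,c" CPU lists.
The function names (expand_cpulist, cpulist_subset, nodes_containing)
are made up for illustration and are not part of Slurm.

```shell
#!/bin/sh
# Expand a Linux CPU list such as "0-3,8" into one CPU id per line.
# Assumption: only plain ids and "a-b" ranges appear (no stride syntax).
expand_cpulist() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS=- read -r start end; do
        [ -n "$start" ] || continue
        # A single id has no "-", so $end is empty: fall back to $start.
        seq "$start" "${end:-$start}"
    done
}

# Succeed if every CPU in list $1 also appears in list $2.
cpulist_subset() {
    for cpu in $(expand_cpulist "$1"); do
        # -x matches the whole line, so "1" does not match "11".
        expand_cpulist "$2" | grep -qx "$cpu" || return 1
    done
    return 0
}

# Print the NUMA node(s) whose CPU list fully contains list $1.
# Prints nothing if the CPUs are spread over several nodes.
nodes_containing() {
    for node in /sys/devices/system/node/node[0-9]*; do
        [ -r "$node/cpulist" ] || continue
        if cpulist_subset "$1" "$(cat "$node/cpulist")"; then
            echo "${node##*/}"
        fi
    done
}

# Inside a job, with the same cgroup v1 cpuset path as above:
#   nodes_containing "$(cat /sys/fs/cgroup/cpuset/$(awk -F: '/cpuset/ {print $3}' /proc/$$/cgroup)/cpuset.cpus)"
```

On the machine above, the first allocation (1,3,5,...,19) fits entirely
in node1, while 0-9 mixes even and odd CPU ids and so fits in neither
node.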

HTH.

Cheers,
-- 
Kilian


