[slurm-users] Using GRES to manage GPUs, but unable to assign specific CPUs to specific GPUs

Julie Bernauer jbernauer at nvidia.com
Mon Sep 17 21:49:28 MDT 2018


Hi Randy,

This is expected on a hyper-threaded (HT) machine like the one described below.  If you run lstopo, you see:
      L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
        PU L#10 (P#5)
        PU L#11 (P#45)
Slurm uses the logical (L#) numbering shown above, so Cores=10-11 picks up PU L#10 and PU L#11, which the OS numbers as CPUs 5 and 45 (the two hardware threads of core L#5).
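
If you want to double-check that mapping on the node itself, the kernel exposes the hyper-thread siblings directly.  A quick sketch (run on node-01; the 5,45 value is just what the topology above implies for this particular box):

[node-01:~]$ cat /sys/devices/system/cpu/cpu5/topology/thread_siblings_list
5,45
[node-01:~]$ lscpu --extended=CPU,CORE,SOCKET

The lscpu table lists every logical CPU together with the physical core and socket it belongs to, so you can see which pairs of OS CPU numbers share a core.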

Julie



________________________________
From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Randall Radmer <radmer at gmail.com>
Sent: Wednesday, September 12, 2018 10:14 AM
To: slurm-users at lists.schedmd.com
Subject: [slurm-users] Using GRES to manage GPUs, but unable to assign specific CPUs to specific GPUs

I’m using GRES to manage eight GPUs in a node on a new Slurm cluster and am trying to bind specific CPUs to specific GPUs, but it’s not working as I expected.

I am able to request a specific number of GPUs, but the CPU assignment seems wrong.

I assume I’m missing something obvious, but just can't find it.  Any suggestion for how to fix it, or how to better investigate the problem, would be much appreciated.

Example srun requesting one GPU follows:
$ srun -p dgx1 --gres=gpu:1 --pty $SHELL
[node-01:~]$ nvidia-smi --query-gpu=index,name --format=csv
index, name
0, Tesla V100-SXM2-16GB
[node-01:~]$ cat /sys/fs/cgroup/cpuset/slurm/uid_*/job_*/cpuset.cpus
5,45

Similar example requesting eight GPUs follows:
$ srun -p dgx1 --gres=gpu:8 --pty $SHELL
[node-01:~]$ nvidia-smi --query-gpu=index,name --format=csv
index, name
0, Tesla V100-SXM2-16GB
1, Tesla V100-SXM2-16GB
2, Tesla V100-SXM2-16GB
3, Tesla V100-SXM2-16GB
4, Tesla V100-SXM2-16GB
5, Tesla V100-SXM2-16GB
6, Tesla V100-SXM2-16GB
7, Tesla V100-SXM2-16GB
[node-01:~]$ cat /sys/fs/cgroup/cpuset/slurm/uid_*/job_*/cpuset.cpus
5,45
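
As an extra check from inside the allocation (just a sketch; the pid shown is a placeholder), taskset reports the same affinity list as the cgroup cpuset:

[node-01:~]$ taskset -cp $$
pid 1234's current affinity list: 5,45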

The machines all run Ubuntu 16.04, and the Slurm version is 17.11.9-2.

The /etc/slurm/gres.conf file follows:
[node-01:~]$ less /etc/slurm/gres.conf
Name=gpu Type=V100 File=/dev/nvidia0 Cores=10-11
Name=gpu Type=V100 File=/dev/nvidia1 Cores=12-13
Name=gpu Type=V100 File=/dev/nvidia2 Cores=14-15
Name=gpu Type=V100 File=/dev/nvidia3 Cores=16-17
Name=gpu Type=V100 File=/dev/nvidia4 Cores=18-19
Name=gpu Type=V100 File=/dev/nvidia5 Cores=20-21
Name=gpu Type=V100 File=/dev/nvidia6 Cores=22-23
Name=gpu Type=V100 File=/dev/nvidia7 Cores=24-25
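
A related check when writing the Cores= lines (just a sketch; output omitted since it depends on the box) is the per-GPU CPU affinity that the driver reports:

[node-01:~]$ nvidia-smi topo -m

The matrix it prints includes a "CPU Affinity" column showing which CPU ranges are local to each GPU.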

The /etc/slurm/slurm.conf file on all machines in the cluster follows (with minor cleanup):
ClusterName=testcluster
ControlMachine=slurm-master
SlurmUser=slurm
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/munge
SlurmdSpoolDir=/var/spool/slurm/d
SwitchType=switch/none
MpiDefault=none
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
ProctrackType=proctrack/cgroup
PluginDir=/usr/lib/slurm
ReturnToService=2
Prolog=/etc/slurm/slurm.prolog
PrologSlurmctld=/etc/slurm/slurm.ctld.prolog
Epilog=/etc/slurm/slurm.epilog
EpilogSlurmctld=/etc/slurm/slurm.ctld.epilog
TaskProlog=/etc/slurm/slurm.task.prolog
TaskPlugin=task/affinity,task/cgroup
SlurmctldTimeout=300
SlurmdTimeout=300
InactiveLimit=0
MinJobAge=300
KillWait=20
Waittime=0
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
FastSchedule=0
DebugFlags=CPU_Bind,gres
SlurmctldDebug=debug5
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurm/slurmd.log
JobCompType=jobcomp/filetxt
JobCompLoc=/data/slurm/job_completions.log
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageLoc=/data/slurm/accounting_storage.log
AccountingStorageEnforce=associations,limits,qos
AccountingStorageTRES=gres/gpu,gres/gpu:V100
PreemptMode=SUSPEND,GANG
PrologFlags=Serial,Alloc
RebootProgram="/sbin/shutdown -r 3"
PreemptType=preempt/partition_prio
CacheGroups=0
DefMemPerCPU=2048
GresTypes=gpu
NodeName=node-01 State=UNKNOWN \
                 Sockets=2 CoresPerSocket=20 ThreadsPerCore=2 \
                 Gres=gpu:V100:8
PartitionName=all Nodes=node-01 \
                  Default=YES MaxTime=4:0:0 DefaultTime=4:0:0 State=UP
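
For reference, the topology Slurm actually detects on the node can be compared against the NodeName line above (a sketch; output omitted):

[node-01:~]$ slurmd -C
[node-01:~]$ scontrol show node node-01

slurmd -C prints the sockets/cores/threads that slurmd detects locally, and scontrol shows what the controller currently has recorded for the node.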


Thanks,
Randy

