[slurm-users] How to view GPU indices of the completed jobs?

Stephan Roth stephan.roth at ee.ethz.ch
Fri Jun 26 13:19:02 UTC 2020


In regard to Kota's initial question

... "Is there any way (commands, configurations, etc...) to see the 
allocated GPU indices for completed jobs?" ...

I was in need of the same kind of information and found the following:

If

- ConstrainDevices is enabled
- SlurmdDebug is set to at least "debug"

then the device number assigned by the nvidia kernel driver can be found by
grepping for the job ID (1143 in this example) and "Allowing access to
device" in slurmd.log on a GPU node:

[2020-06-25T20:51:47.219] [1143.0] debug:  Allowing access to device c 195:0 rwm(/dev/nvidia0) for job
[2020-06-25T20:51:47.220] [1143.0] debug:  Allowing access to device c 195:0 rwm(/dev/nvidia0) for step
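
For example, a minimal sketch (the log path is an assumption and depends on
the SlurmdLogFile setting in your slurm.conf):

# run on the GPU node; replace 1143 with the job ID of interest
grep 'Allowing access to device' /var/log/slurm/slurmd.log | grep '\[1143\.'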

As far as I have observed, the device number matches the PCI device minor
number. If anyone can confirm the inner workings of the (proprietary)
kernel module, I'd be glad to know.

The rest of the information I needed I got from 
/proc/driver/nvidia/gpus/<PCI BUS location>/information.
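
For reference, a sketch of how to read that out for all GPUs at once
(assuming your driver version exposes the Model, Bus Location and Device
Minor fields in that file):

grep -E 'Model|Bus Location|Device Minor' /proc/driver/nvidia/gpus/*/information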

Cheers,
Stephan

On 24.06.20 07:05, Marcus Wagner wrote:
> Hi Taras,
> 
> No, we have set ConstrainDevices to "yes".
> And this is why CUDA_VISIBLE_DEVICES starts from zero.
> 
> Otherwise both jobs mentioned below would have ended up on one GPU, but as
> nvidia-smi clearly shows (I did not include the output this time, see my
> earlier post), both GPUs are in use, even though the environment of both
> jobs includes CUDA_VISIBLE_DEVICES=0.
> 
> Kota, might it be that you did not configure ConstrainDevices in
> cgroup.conf? The default is "no" according to the manpage.
> That way, a user could set CUDA_VISIBLE_DEVICES in their job and therefore
> use GPUs they did not request.
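> 
> For reference, a minimal cgroup.conf sketch (the surrounding options are
> only an example; ConstrainDevices is the relevant one here, and it also
> requires TaskPlugin=task/cgroup in slurm.conf):
> 
> CgroupAutomount=yes
> ConstrainCores=yes
> ConstrainDevices=yes
> ConstrainRAMSpace=yes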
> 
> Best
> Marcus
> 
> On 23.06.2020 at 15:41, Taras Shapovalov wrote:
>> Hi Marcus,
>>
>> This may depend on ConstrainDevices in cgroup.conf. I guess it is set 
>> to "no" in your case.
>>
>> Best regards,
>> Taras
>>
>> On Tue, Jun 23, 2020 at 4:02 PM Marcus Wagner 
>> <wagner at itc.rwth-aachen.de> wrote:
>>
>>     Hi Kota,
>>
>>     thanks for the hint.
>>
>>     Yet, I'm still a little bit astonished, since if I remember right,
>>     CUDA_VISIBLE_DEVICES in a cgroup always starts from zero. That was
>>     already the case years ago, when we were still using LSF.
>>
>>     But SLURM_JOB_GPUS seems to be the right thing:
>>
>>     same node, two different users (and therefore jobs)
>>
>>
>>     $> xargs --null --max-args=1 echo < /proc/32719/environ | egrep "GPU|CUDA"
>>     SLURM_JOB_GPUS=0
>>     CUDA_VISIBLE_DEVICES=0
>>     GPU_DEVICE_ORDINAL=0
>>
>>     $> xargs --null --max-args=1 echo < /proc/109479/environ | egrep "GPU|CUDA"
>>     SLURM_MEM_PER_GPU=6144
>>     SLURM_JOB_GPUS=1
>>     CUDA_VISIBLE_DEVICES=0
>>     GPU_DEVICE_ORDINAL=0
>>     CUDA_ROOT=/usr/local_rwth/sw/cuda/10.1.243
>>     CUDA_PATH=/usr/local_rwth/sw/cuda/10.1.243
>>     CUDA_VERSION=101
>>
>>     SLURM_JOB_GPUS differs between the two jobs, and scontrol shows the
>>     corresponding GRES_IDX:
>>
>>     $> scontrol show -d job 14658274
>>     ...
>>     Nodes=nrg02 CPU_IDs=24 Mem=8192 GRES_IDX=gpu:volta(IDX:1)
>>
>>     $> scontrol show -d job 14673550
>>     ...
>>     Nodes=nrg02 CPU_IDs=0 Mem=8192 GRES_IDX=gpu:volta(IDX:0)
>>
>>
>>
>>     Is there anyone out there who can confirm this besides me?
>>
>>
>>     Best
>>     Marcus
>>
>>
>>     On 23.06.2020 at 04:51, Kota Tsuyuzaki wrote:
>>      >> if I remember right, if you use cgroups, CUDA_VISIBLE_DEVICES always
>>      >> starts from zero. So this is NOT the index of the GPU.
>>      >
>>      > Thanks. Just FYI, when I tested the environment variables with
>>     Slurm 19.05.2 + proctrack/cgroup configuration, it looks like
>>     CUDA_VISIBLE_DEVICES matches the indices of the host devices (i.e. it
>>     does not start from zero). I'm not sure whether the behavior has
>>     changed in newer Slurm versions, though.
>>      >
>>      > I also found that SLURM_JOB_GPUS and GPU_DEVICE_ORDINAL were set
>>     as environment variables, which can be useful. In my current tests,
>>     those variables had the same values as CUDA_VISIBLE_DEVICES.
>>      >
>>      > Any advice on what I should look for is always welcome.
>>      >
>>      > Best,
>>      > Kota
>>      >
>>      >> -----Original Message-----
>>      >> From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Marcus Wagner
>>      >> Sent: Tuesday, June 16, 2020 9:17 PM
>>      >> To: slurm-users at lists.schedmd.com
>>      >> Subject: Re: [slurm-users] How to view GPU indices of the completed jobs?
>>      >>
>>      >> Hi David,
>>      >>
>>      >> if I remember right, if you use cgroups, CUDA_VISIBLE_DEVICES always
>>      >> starts from zero. So this is NOT the index of the GPU.
>>      >>
>>      >> Just verified it:
>>      >> $> nvidia-smi
>>      >> Tue Jun 16 13:28:47 2020
>>      >> +-----------------------------------------------------------------------------+
>>      >> | NVIDIA-SMI 440.44       Driver Version: 440.44       CUDA Version: 10.2     |
>>      >> ...
>>      >> +-----------------------------------------------------------------------------+
>>      >> | Processes:                                                       GPU Memory |
>>      >> |  GPU       PID   Type   Process name                             Usage      |
>>      >> |=============================================================================|
>>      >> |    0     17269      C   gmx_mpi                                      679MiB |
>>      >> |    1     19246      C   gmx_mpi                                      513MiB |
>>      >> +-----------------------------------------------------------------------------+
>>      >>
>>      >> $> squeue -w nrg04
>>      >>    JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
>>      >> 14560009  c18g_low     egf5 bk449967  R 1-00:17:48      1 nrg04
>>      >> 14560005  c18g_low     egf1 bk449967  R 1-00:20:23      1 nrg04
>>      >>
>>      >>
>>      >> $> scontrol show job -d 14560005
>>      >> ...
>>      >>      Socks/Node=* NtasksPerN:B:S:C=24:0:*:* CoreSpec=*
>>      >>        Nodes=nrg04 CPU_IDs=0-23 Mem=93600 GRES_IDX=gpu(IDX:0)
>>      >>
>>      >> $> scontrol show job -d 14560009
>>      >> JobId=14560009 JobName=egf5
>>      >> ...
>>      >>      Socks/Node=* NtasksPerN:B:S:C=24:0:*:* CoreSpec=*
>>      >>        Nodes=nrg04 CPU_IDs=24-47 Mem=93600 GRES_IDX=gpu(IDX:1)
>>      >>
>>      >> From the PIDs in the nvidia-smi output:
>>      >>
>>      >> $> xargs --null --max-args=1 echo < /proc/17269/environ | grep CUDA_VISIBLE
>>      >> CUDA_VISIBLE_DEVICES=0
>>      >>
>>      >> $> xargs --null --max-args=1 echo < /proc/19246/environ | grep CUDA_VISIBLE
>>      >> CUDA_VISIBLE_DEVICES=0
>>      >>
>>      >>
>>      >> So this is only a way to see how MANY devices were used, not which.
>>      >>
>>      >>
>>      >> Best
>>      >> Marcus
>>      >>
>>      >> On 10.06.2020 at 20:49, David Braun wrote:
>>      >>> Hi Kota,
>>      >>>
>>      >>> This is from the job template that I give to my users:
>>      >>>
>>      >>> # Collect some information about the execution environment that may
>>      >>> # be useful should we need to do some debugging.
>>      >>>
>>      >>> echo "CREATING DEBUG DIRECTORY"
>>      >>> echo
>>      >>>
>>      >>> mkdir .debug_info
>>      >>> module list > .debug_info/environ_modules 2>&1
>>      >>> ulimit -a > .debug_info/limits 2>&1
>>      >>> hostname > .debug_info/environ_hostname 2>&1
>>      >>> env |grep SLURM > .debug_info/environ_slurm 2>&1
>>      >>> env |grep OMP |grep -v OMPI > .debug_info/environ_omp 2>&1
>>      >>> env |grep OMPI > .debug_info/environ_openmpi 2>&1
>>      >>> env > .debug_info/environ 2>&1
>>      >>>
>>      >>> if [ ! -z ${CUDA_VISIBLE_DEVICES+x} ]; then
>>      >>>           echo "SAVING CUDA ENVIRONMENT"
>>      >>>           echo
>>      >>>           env |grep CUDA > .debug_info/environ_cuda 2>&1
>>      >>> fi
>>      >>>
>>      >>> You could add something like this to one of the SLURM prologs to
>>      >>> save the GPU list of jobs.
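>>      >>>
>>      >>> For example, a prolog sketch (untested; it assumes SLURM_JOB_GPUS is
>>      >>> set in the prolog environment and that the log file below is
>>      >>> writable by the slurmd user):
>>      >>>
>>      >>> #!/bin/bash
>>      >>> # record which GPU indices each job was given on this node
>>      >>> if [ -n "$SLURM_JOB_GPUS" ]; then
>>      >>>     echo "$(date +%FT%T) $(hostname -s) job=$SLURM_JOB_ID gpus=$SLURM_JOB_GPUS" \
>>      >>>         >> /var/log/slurm/gpu_jobs.log
>>      >>> fi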
>>      >>>
>>      >>> Best,
>>      >>>
>>      >>> David
>>      >>>
>>      >>> On Thu, Jun 4, 2020 at 4:02 AM Kota Tsuyuzaki
>>      >>> <kota.tsuyuzaki.pc at hco.ntt.co.jp> wrote:
>>      >>>
>>      >>>      Hello Guys,
>>      >>>
>>      >>>      We are running GPU clusters with Slurm and SlurmDBD (version
>>      >>>      19.05 series), and some of the GPUs seem to be causing trouble
>>      >>>      for the jobs attached to them. To investigate whether the
>>      >>>      troubles happened on the same GPUs, I'd like to get the GPU
>>      >>>      indices of the completed jobs.
>>      >>>
>>      >>>      In my understanding, `scontrol show job` can show the indices
>>      >>>      (as IDX in the gres info) but cannot be used for completed
>>      >>>      jobs, while `sacct -j` works for completed jobs but won't
>>      >>>      print the indices.
>>      >>>
>>      >>>      Is there any way (commands, configurations, etc...) to see
>>      >>>      the allocated GPU indices for completed jobs?
>>      >>>
>>      >>>      Best regards,
>>      >>>
>>      >>>      --------------------------------------------
>>      >>>      露崎 浩太 (Kota Tsuyuzaki)
>>      >>> kota.tsuyuzaki.pc at hco.ntt.co.jp
>>      >>>      NTT Software Innovation Center
>>      >>>      Distributed Processing Platform Technology Project
>>      >>>      0422-59-2837
>>      >>>      ---------------------------------------------
>>      >>>
>>      >>>
>>      >>>
>>      >>>
>>      >>>
>>      >>
>>      >
>>      >
>>      >
>>      >
>>
>>     --
>>     Dipl.-Inf. Marcus Wagner
>>
>>     IT Center
>>     Group: Linux Systems Group
>>     Department: Systems and Operations
>>     RWTH Aachen University
>>     Seffenter Weg 23
>>     52074 Aachen
>>     Tel: +49 241 80-24383
>>     Fax: +49 241 80-624383
>>     wagner at itc.rwth-aachen.de <mailto:wagner at itc.rwth-aachen.de>
>>     www.itc.rwth-aachen.de <http://www.itc.rwth-aachen.de>
>>
>>     Social media channels of the IT Center:
>>     https://blog.rwth-aachen.de/itc/
>>     https://www.facebook.com/itcenterrwth
>>     https://www.linkedin.com/company/itcenterrwth
>>     https://twitter.com/ITCenterRWTH
>>     https://www.youtube.com/channel/UCKKDJJukeRwO0LP-ac8x8rQ


