[slurm-users] Slurm does not set CUDA_VISIBLE_DEVICES

Vladimir Goy vovagoy at gmail.com
Mon May 21 01:43:57 MDT 2018


Dear Miguel,

>Do you have --gres=gpu:0 on your job script? Is gres.conf properly
configured?

No. gres.conf worked well on Slurm 17.02.2.
Here is my job file:

#!/bin/bash
#
#SBATCH --job-name=GPU
#SBATCH --partition=gpu
#
#SBATCH --ntasks=1
#SBATCH --mem=100
#SBATCH --time=10
#SBATCH --gres=gpu:K40:1
srun echo "Job id: $SLURM_JOB_ID"
srun echo "GPU id: $SLURM_JOB_GPUS"
srun echo "VisDev: $CUDA_VISIBLE_DEVICES"
srun sleep 120
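
One detail of the script above may be worth checking: in `srun echo "VisDev: $CUDA_VISIBLE_DEVICES"` the double-quoted variable is expanded by the batch shell before srun starts, so those lines report the batch shell's environment rather than the job step's. Whether the two differ on this cluster is not established in the thread. A minimal sketch that prints both views follows; the helper name `report_gpu_env` and the `<unset>` placeholder are illustrative, not part of Slurm.

```shell
#!/bin/bash
# Print the GPU-related variables as seen by the *current* process.
# report_gpu_env is a hypothetical helper name; "<unset>" marks a variable
# that is missing or empty.
report_gpu_env() {
    echo "Job id: ${SLURM_JOB_ID:-<unset>}"
    echo "GPU id: ${SLURM_JOB_GPUS:-<unset>}"
    echo "VisDev: ${CUDA_VISIBLE_DEVICES:-<unset>}"
}

# 1) Environment of the batch shell itself.
echo "== batch shell =="
report_gpu_env

# 2) Environment inside a job step. The single quotes defer expansion to the
#    shell that srun launches. Skipped when not running under Slurm.
if command -v srun >/dev/null 2>&1 && [ -n "${SLURM_JOB_ID:-}" ]; then
    echo "== job step =="
    srun bash -c 'echo "VisDev: ${CUDA_VISIBLE_DEVICES:-<unset>}"'
fi
```

If both views print `<unset>`, the variable is missing from the job environment as a whole; if only one does, the difference narrows down where it gets lost.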

My gres.conf:
# Configure support for our two GPUs (n[01-10])
Name=gpu Type=K40 File=/dev/nvidia0 CPUs=0
Name=gpu Type=K40 File=/dev/nvidia1 CPUs=10
Name=gpu Type=cpu      CPUs=2-9,12-19 Count=16
Name=gpu Type=debugcpu CPUs=1,11      Count=2
#CPUs 1 and 11 are reserved for system or for debug.
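
The gres.conf above mixes entries that name a device file (`File=`) with entries that only give a `Count=`. To see at a glance which entries carry a device file, a small sketch like the following can list them. It assumes each entry is a single line of whitespace-separated `Key=Value` pairs; the function name `list_gres_entries`, the `GRES_CONF` variable, and the default path are assumptions for illustration.

```shell
#!/bin/bash
# List each gres.conf entry as "<Type> <File>", printing "-" for a missing key.
# Assumption: one entry per line, whitespace-separated Key=Value pairs.
list_gres_entries() {
    awk '
        /^[[:space:]]*#/ || NF == 0 { next }   # skip comments and blank lines
        {
            type = "-"; file = "-"
            for (i = 1; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == "Type") type = kv[2]
                if (kv[1] == "File") file = kv[2]
            }
            print type, file
        }
    ' "$1"
}

# GRES_CONF and its default location are hypothetical; adjust to your site.
conf="${GRES_CONF:-/etc/slurm/gres.conf}"
if [ -r "$conf" ]; then
    list_gres_entries "$conf"
fi
```

For the file shown above, the two K40 lines would be listed with /dev/nvidia0 and /dev/nvidia1, and the `cpu` and `debugcpu` lines with `-`.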


Thank you for your help.
Vova.


2018-05-21 17:18 GMT+10:00 Miguel Gila <miguel.gila at cscs.ch>:

> Vova,
>
> Do you have --gres=gpu:0 on your job script? Is gres.conf properly
> configured?
>
> I think this is what sets the variable:
> https://github.com/SchedMD/slurm/blob/bcdd09d3386f4b4038ae9263b0e69d4d742988b2/src/plugins/gres/gpu/gres_gpu.c#L96
>
> Cheers,
> Miguel
>
>
> On 21 May 2018, at 08:28, Vladimir Goy <vovagoy at gmail.com> wrote:
>
> Hello,
>
> Please help me.
> I would like to ask about the following bug: why does Slurm not set the
> CUDA_VISIBLE_DEVICES variable before running the user application? I have
> observed this since updating Slurm 17.02.2 -> 17.11.6. Who can fix this
> problem, and how? Version 17.02.2 worked well, but I cannot downgrade now
> because slurmdbd has updated my database.
>
> Best regards,
> Vova.
>
>
>