[slurm-users] GPU + no_consume

Félix C. Morency felix.morency at gmail.com
Tue Jul 10 11:50:34 MDT 2018


Hi,
I'm currently experimenting with SLURM 17.11.7, cgroups, and a node with 2
GPUs. Everything works fine if I set the GPUs to be consumable: the cgroups
do their job and the right device is allocated to the right job. However,
it doesn't work if I set `Gres=gpu:no_consume:2`. For some reason, SLURM
doesn't allow access to the devices:
Jul 10 13:54:23 imk-dl-01 slurmstepd[2232]: debug:  Not allowing access to device c 195:0 rwm(/dev/nvidia0) for job
Jul 10 13:54:23 imk-dl-01 slurmstepd[2232]: debug:  Not allowing access to device c 195:1 rwm(/dev/nvidia1) for job
Jul 10 13:54:23 imk-dl-01 slurmstepd[2232]: debug:  Not allowing access to device c 195:0 rwm(/dev/nvidia0) for step
Jul 10 13:54:23 imk-dl-01 slurmstepd[2232]: debug:  Not allowing access to device c 195:1 rwm(/dev/nvidia1) for step
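
For context, the relevant configuration is along these lines (device paths
are taken from the log above; the snippets are paraphrased rather than
copied verbatim, and other node parameters are omitted):

  # slurm.conf (node definition; other parameters omitted)
  GresTypes=gpu
  TaskPlugin=task/cgroup
  NodeName=imk-dl-01 Gres=gpu:no_consume:2

  # gres.conf on imk-dl-01
  Name=gpu File=/dev/nvidia0
  Name=gpu File=/dev/nvidia1

  # cgroup.conf (the device constraint that produces the messages above)
  ConstrainDevices=yes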

I don't understand why it doesn't work. I'm using the nvidia-384 driver and
I can launch multiple processes on a single GPU outside of SLURM. Any ideas?
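
To be clear about the comparison: directly on the node, two processes can
share a GPU without any problem, while running through SLURM hits the device
denial above. Something along these lines (the commands are just
illustrations, not my exact workload):

  # directly on the node, outside SLURM: sharing GPU 0 works
  $ CUDA_VISIBLE_DEVICES=0 ./gpu_app &
  $ CUDA_VISIBLE_DEVICES=0 ./gpu_app &

  # through SLURM with gpu:no_consume:2: the step gets "Not allowing access"
  $ srun --gres=gpu:1 ./gpu_app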

Thanks,
-F


