[slurm-users] Two jobs ends up on one GPU?

Magnus Jonsson magnus at hpc2n.umu.se
Tue Jan 15 12:15:46 UTC 2019


Hi!

We have machines with multiple GPUs (Nvidia V100).
We allow multiple (two) jobs on the nodes.

We have a user who has somehow managed to get both of their jobs to end up 
on the same GPU (verified via nvidia-smi).
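
(For context, the sharing was visible in the per-GPU compute-process list. 
A rough sketch of that kind of check, wrapped in Python only for readability 
and not the exact command we ran:)

    #!/usr/bin/env python3
    # Sketch: list compute processes per GPU on the node so job PIDs can be
    # matched against GPU UUIDs, e.g. to see whether two jobs share a device.
    import subprocess

    out = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid,gpu_uuid",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    for line in out.strip().splitlines():
        pid, uuid = (field.strip() for field in line.split(","))
        print(f"PID {pid} is running on GPU {uuid}")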

We are using cgroups, and normally nvidia-smi shows only one of the 
GPUs (if only one GPU is requested) and only the assigned /dev/nvidia? 
device is accessible.
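
(To illustrate what I mean by device access: a minimal sketch, run from 
inside a job step, assuming cgroup device constraints are in effect; the 
script itself is illustrative, not part of our setup:)

    #!/usr/bin/env python3
    # Sketch: try to open each /dev/nvidiaN node to see which GPUs the cgroup
    # devices controller actually allows. With device constraints applied, a
    # one-GPU job should only succeed on its single granted device.
    import glob
    import os

    print("CUDA_VISIBLE_DEVICES =", os.environ.get("CUDA_VISIBLE_DEVICES"))
    for dev in sorted(glob.glob("/dev/nvidia[0-9]*")):
        try:
            fd = os.open(dev, os.O_RDWR)
            os.close(fd)
            print(f"{dev}: accessible")
        except OSError as err:
            print(f"{dev}: blocked ({err.strerror})")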

We are unable to reproduce this. Has anybody seen anything like this?

/Magnus

-- 
Magnus Jonsson, Developer, HPC2N, Umeå Universitet


