[slurm-users] Using gpus-per-task for allocation of distinct GPUs

Andrei Berceanu andreicberceanu at gmail.com
Sun Feb 7 21:42:49 UTC 2021


Under Slurm 19.05.2 with SelectType=select/cons_tres, I was not able to
allocate distinct GPUs on the same node via this script:

#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --ntasks=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=4
#SBATCH --gres=gpu:2
#SBATCH --gpus-per-task=1

srun --ntasks=2 --gres=gpu:1 nvidia-smi -L

The output from the two tasks was:

GPU 0: Tesla V100-SXM3-32GB (UUID: GPU-c55b3036-d54d-a885-7c6c-4238840c836e)
GPU 0: Tesla V100-SXM3-32GB (UUID: GPU-c55b3036-d54d-a885-7c6c-4238840c836e)

I was expecting the GPU index and UUID to differ between the two tasks,
but instead both tasks ran on the same GPU.

I am aware of the `CUDA_VISIBLE_DEVICES` workaround (sketched below), but I
thought --gpus-per-task could solve this more elegantly. Am I missing something?
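For reference, the workaround I have in mind is roughly the following, where
./my_cuda_app stands in for an actual CUDA application (CUDA_VISIBLE_DEVICES is
honoured by CUDA programs, but not by nvidia-smi itself), and where I am
assuming SLURM_LOCALID maps one-to-one onto the GPU indices on the node:

# each task restricts itself to one device based on its node-local task id
srun --ntasks=2 bash -c 'export CUDA_VISIBLE_DEVICES=$SLURM_LOCALID; ./my_cuda_app'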

Best,
Andrei

Note: cross-posted to https://stackoverflow.com/q/66092965/10260561
