[slurm-users] GRES and GPUs

Xaver Stiensmeier xaverstiensmeier at gmx.de
Mon Jul 17 11:43:59 UTC 2023


I am currently trying to understand how I can schedule a job that needs
a GPU.

I read about GRES (https://slurm.schedmd.com/gres.html) and tried to use:

NodeName=test Gres=gpu:1

But running the following, after a 'sudo scontrol reconfigure':

srun --gpus 1 hostname

didn't work:

srun: error: Unable to allocate resources: Invalid generic resource (gres) specification

So I read further (https://slurm.schedmd.com/gres.conf.html), but that
didn't really help me.

I am rather confused. GRES claims to be generic resources but then it
comes with three defined resources (GPU, MPS, MIG) and using one of
those didn't work in my case.

Obviously, I am misunderstanding something, but I am unsure where to look.

Best regards,
Xaver Stiensmeier
