[slurm-users] 19.05 and GPUs vs GRES

Chris Samuel chris at csamuel.org
Tue Aug 13 05:25:15 UTC 2019


On Monday, 12 August 2019 11:42:48 AM PDT Christopher Benjamin Coffey wrote:

> Excuse me if this has been explained somewhere, I did some searching. With
> 19.05, is there any reason to have gres.conf on the GPU nodes? Is slurm
> smart enough to enumerate the /dev/nvidia* devices? We are moving to 19.05
> shortly, any gotchas with GRES and GPUs? Also, I'm guessing now, there is
> no reason for users to request "--gres:gpu" type stuff anymore and instead
> use: --gpus=n ?

We do have 19.05 on our GPU nodes, but I've not had time to experiment with 
the new request syntax just yet.

Regarding configuration, it does appear that you still need to set them up, 
but if you link Slurm against the NVIDIA NVML library at compile time then 
there is support for autodetection.

https://slurm.schedmd.com/gres.html

# In the case of GPUs, if AutoDetect=nvml in gres.conf and the NVML library
# is installed on the node and was present during Slurm configuration, the
# missing configuration details will be automatically gathered using the
# NVML library. Configuration information about all other generic resource
# must explicitly be described in the gres.conf file. 
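
For what it's worth, here is a minimal sketch of what that might look like.
I have not tested this exact fragment; the node names and GPU count are made
up for illustration, and it assumes slurmd was built with NVML support.

```
# gres.conf on the GPU nodes
# Ask slurmd to fill in the GPU device details via NVML.
AutoDetect=nvml

# slurm.conf (cluster-wide)
# The GRES type still has to be declared, and each node line still
# has to say how many GPUs it offers.
GresTypes=gpu
# Hypothetical nodes with 4 GPUs each; other node parameters omitted.
NodeName=gpunode[01-04] Gres=gpu:4
```

As I read the documentation, autodetection only fills in the per-device
details in gres.conf. It does not remove the need for the Gres= entry on the
node definition in slurm.conf.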

All the best,
Chris
-- 
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA





