[slurm-users] 19.05 and GPUs vs GRES
Chris Samuel
chris at csamuel.org
Tue Aug 13 05:25:15 UTC 2019
On Monday, 12 August 2019 11:42:48 AM PDT Christopher Benjamin Coffey wrote:
> Excuse me if this has been explained somewhere, I did some searching. With
> 19.05, is there any reason to have gres.conf on the GPU nodes? Is slurm
> smart enough to enumerate the /dev/nvidia* devices? We are moving to 19.05
> shortly, any gotchas with GRES and GPUs? Also, I'm guessing now, there is
> no reason for users to request "--gres=gpu" type stuff anymore and instead
> use: --gpus=n ?
We do have 19.05 on our GPU nodes, but I've not had time to experiment with
the new request syntax just yet.
As for configuration, it appears you still need to define the GPUs as GRES,
but if you build Slurm against the NVIDIA NVML library then autodetection
is supported.
https://slurm.schedmd.com/gres.html
# In the case of GPUs, if AutoDetect=nvml in gres.conf and the NVML library
# is installed on the node and was present during Slurm configuration, the
# missing configuration details will be automatically gathered using the
# NVML library. Configuration information about all other generic resource
# must explicitly be described in the gres.conf file.
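As a rough sketch of what that could look like (the node names and GPU count
are made up for illustration, and I haven't tested this exact fragment), the
autodetect case would be something along these lines:

```
# gres.conf on the GPU node
# (assumes Slurm was built with the NVML library present)
AutoDetect=nvml

# slurm.conf
# (hypothetical node names and GPU count; the usual CPU/memory
#  settings for the nodes are left out here)
GresTypes=gpu
NodeName=gpu[01-04] Gres=gpu:4

# Without NVML support, gres.conf would instead list the devices
# explicitly, for example:
# Name=gpu File=/dev/nvidia[0-3]
```

Either way the GPUs still have to be declared as GRES in slurm.conf; the
autodetection only fills in the per-device details in gres.conf.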
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA