[slurm-users] CUDA environment variable not being set

Relu Patrascu relu at cs.toronto.edu
Thu Oct 8 20:25:57 UTC 2020


That usually means the nvidia kernel module is not loaded, most likely
because no driver is installed. Without the module, the /dev/nvidia*
device files do not exist.
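
A quick way to check both conditions on the compute node is sketched
below. This is only a rough illustration, not something from the original
report: the helper names are made up, and it assumes the usual Linux
locations (/proc/modules for loaded modules, /dev/nvidia0, /dev/nvidia1,
... for the per-GPU device files).

```python
import glob
import os


def nvidia_module_loaded(proc_modules="/proc/modules"):
    """Return True if a module named exactly 'nvidia' is listed.

    Hypothetical helper; assumes the standard /proc/modules layout where
    the first whitespace-separated field of each line is the module name.
    """
    try:
        with open(proc_modules) as fh:
            return any(line.split()[0] == "nvidia" for line in fh if line.strip())
    except OSError:
        # File missing or unreadable: treat as "not loaded".
        return False


def nvidia_device_files(dev_dir="/dev"):
    """Return the per-GPU device nodes (nvidia0, nvidia1, ...) under dev_dir.

    Hypothetical helper; the pattern skips nvidiactl and nvidia-uvm,
    which are not per-GPU devices.
    """
    return sorted(glob.glob(os.path.join(dev_dir, "nvidia[0-9]*")))


if __name__ == "__main__":
    print("nvidia module loaded:", nvidia_module_loaded())
    print("GPU device files:", nvidia_device_files() or "none found")
```

If the first line reports False or the second reports none found, the
driver is the place to look before touching the Slurm configuration.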

Relu

On 2020-10-08 14:57, Sajesh Singh wrote:
>
> Slurm 18.08
>
> CentOS 7.7.1908
>
> I have 2 M500 GPUs in a compute node that is defined in the cluster's
> slurm.conf and gres.conf, but when I launch a job requesting GPUs, the
> environment variable CUDA_VISIBLE_DEVICES is never set and I see the
> following message in the slurmd.log file:
>
> debug:  common_gres_set_env: unable to set env vars, no device files configured
>
> Has anyone encountered this before?
>
> Thank you,
>
> SS
>