[slurm-users] slurm, gres:gpu, only 1 GPU out of 4 is detected

Tamas Hegedus tamas at hegelab.org
Wed Nov 13 18:11:30 UTC 2019


Thanks for your suggestion. You are right: I do not need to specify 
GPU IDs explicitly.
(I have not compiled your code; I simply tested two Gromacs runs on the 
same node, each submitted with --gres=gpu:1.)
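
A lighter check that does not need nvcc is to look at the CUDA_VISIBLE_DEVICES
environment variable, which Slurm sets for jobs that request a GPU gres. The
sketch below is plain C and only reads environment variables; the helper names
count_visible_devices and report_visible_devices are hypothetical, not part of
Slurm or CUDA.

```c
#include <stdio.h>
#include <stdlib.h>

/* Count the comma-separated entries in a CUDA_VISIBLE_DEVICES-style string.
 * Returns 0 for NULL or an empty string.
 * Hypothetical helper for illustration; not a Slurm or CUDA API. */
int count_visible_devices(const char *value)
{
    if (value == NULL || value[0] == '\0')
        return 0;

    int count = 1;
    for (const char *p = value; *p != '\0'; ++p)
        if (*p == ',')
            ++count;
    return count;
}

/* Print the job ID and the GPU list seen by this process.
 * Returns the number of listed devices. */
int report_visible_devices(void)
{
    const char *job  = getenv("SLURM_JOB_ID");
    const char *devs = getenv("CUDA_VISIBLE_DEVICES");
    int n = count_visible_devices(devs);

    printf("job %s: CUDA_VISIBLE_DEVICES=%s (%d device(s))\n",
           job ? job : "(none)", devs ? devs : "(unset)", n);
    return n;
}
```

Calling report_visible_devices() from main in each of two simultaneous
--gres=gpu:1 jobs should report one device per job. When devices are
constrained through cgroups, both jobs may show the same index (0), so the PCI
bus ID printed by a device query is the more reliable way to tell the physical
GPUs apart.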

On 11/13/19 5:17 PM, Renfro, Michael wrote:
> Pretty sure you don’t need to explicitly specify GPU IDs on a Gromacs job running inside of Slurm with gres=gpu. Gromacs should only see the GPUs you have reserved for that job.
>
> Here’s a short program you can run to verify that two different GPU jobs see different GPU devices (compile with nvcc):
>
> =====
>
> // From http://www.cs.fsu.edu/~xyuan/cda5125/examples/lect24/devicequery.cu
> #include <stdio.h>
> #include <cuda_runtime.h>
>
> // Print the name, multiprocessor count, and PCI location of one device
> void printDevProp(cudaDeviceProp dP)
> {
>      printf("%s has %d multiprocessors\n", dP.name, dP.multiProcessorCount);
>      printf("%s has PCI BusID %d, DeviceID %d\n", dP.name, dP.pciBusID, dP.pciDeviceID);
> }
>
> int main()
> {
>      // Number of CUDA devices visible to this process
>      int devCount = 0;
>      cudaError_t err = cudaGetDeviceCount(&devCount);
>      if (err != cudaSuccess)
>      {
>          fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
>          return 1;
>      }
>      printf("There are %d CUDA devices.\n", devCount);
>      // Iterate through devices
>      for (int i = 0; i < devCount; ++i)
>      {
>          // Get device properties
>          printf("CUDA Device #%d: ", i);
>          cudaDeviceProp devProp;
>          cudaGetDeviceProperties(&devProp, i);
>          printDevProp(devProp);
>      }
>      return 0;
> }
>
> =====
>
> When run from two simultaneous jobs on the same node (each with a gres=gpu), I get:
>
> =====
>
> [renfro@gpunode003(job 221584) hw]$ ./cuda_props
> There are 1 CUDA devices.
> CUDA Device #0: Tesla K80 has 13 multiprocessors
> Tesla K80 has PCI BusID 5, DeviceID 0
>
> =====
>
> [renfro@gpunode003(job 221585) hw]$ ./cuda_props
> There are 1 CUDA devices.
> CUDA Device #0: Tesla K80 has 13 multiprocessors
> Tesla K80 has PCI BusID 6, DeviceID 0
>
> =====
>

-- 
Tamas Hegedus, PhD
Senior Research Fellow
Department of Biophysics and Radiation Biology
Semmelweis University     | phone: (36) 1-459 1500/60233
Tuzolto utca 37-47        | mailto:tamas at hegelab.org
Budapest, 1094, Hungary   | http://www.hegelab.org



