[slurm-users] Job requesting two different GPUs on two different nodes

Diego Zuccato diego.zuccato at unibo.it
Thu Jun 10 11:46:56 UTC 2021


On 10/06/2021 11:35, Gestió Servidors wrote:

I'm no SLURM expert, but a jobfile like this should work:

#!/bin/bash
#
#SBATCH --job-name=N2n4
#SBATCH --partition=cuda.q
#SBATCH --output=N2n4-CUDA.txt
#SBATCH -N 1 # one node with the first GPU type
#SBATCH -n 2 # number of tasks (one core each by default)
#SBATCH --gres=gpu:GeForceRTX3080:1
#SBATCH hetjob
#SBATCH -N 1 # one node with the second GPU type
#SBATCH -n 2 # number of tasks (one core each by default)
#SBATCH --gres=gpu:GeForceRTX2070:1
...

I can't test it (we have no GPU nodes), but that's how it should work,
according to (my understanding of) the docs. Maybe someone more
experienced can refine it.
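
As far as I understand the docs, the same request can also be given on
the sbatch command line, with the components separated by a lone ":".
The sketch below only prints such a command so it can be inspected; it
submits nothing. The helper name build_hetjob_cmd and the script name
job.sh are made up for illustration, and repeating -p for each component
is my assumption, not something I have verified on a real cluster.

```shell
#!/bin/bash
# Sketch: print the command-line form of a two-component heterogeneous job.
# Components are separated by a lone ":" on the sbatch command line.
# Nothing is submitted; the command is only echoed for inspection.

# build_hetjob_cmd is a hypothetical helper, not part of Slurm.
# Args: partition, first GPU type, second GPU type, batch script path.
build_hetjob_cmd() {
    local partition=$1 gpu_a=$2 gpu_b=$3 script=$4
    # Assumption: the partition is repeated for each component.
    printf 'sbatch -p %s -N 1 -n 2 --gres=gpu:%s:1 : -p %s -N 1 -n 2 --gres=gpu:%s:1 %s\n' \
        "$partition" "$gpu_a" "$partition" "$gpu_b" "$script"
}

# job.sh is a placeholder name for the actual batch script.
build_hetjob_cmd cuda.q GeForceRTX3080 GeForceRTX2070 job.sh
```

If the printed command looks right, it can be pasted into a shell on the
login node; whether the scheduler accepts it depends on the local
configuration.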

> No... it doesn't work...
> 
> 
>>>> -----Original Message-----
>>>> From: Diego Zuccato <diego.zuccato at unibo.it>
>>>> Sent: Thursday, 10 June 2021 10:37
>>>> To: Slurm User Community List <slurm-users at lists.schedmd.com>; Gestió
>>>> Servidors <sysadmin.caos at uab.cat>
>>>> Subject: Re: [slurm-users] Job requesting two different GPUs on two different
>>>> nodes
>>>>
>>>> On 08/06/2021 15:55, Gestió Servidors wrote:
>>>>
>>>> Have you tried defining it as a heterogeneous job?
>>>> https://slurm.schedmd.com/heterogeneous_jobs.html
>>>>
>>>> #SBATCH hetjob
>>>> for new SLURM versions or
>>>> #SBATCH packjob
>>>> for older ones
>>>>
>>>> HIH,
>>>>   Diego
>>>>
>>>>> Hi,
>>>>>
>>>>> Today, while running some tests, I could not find a way to write a
>>>>> submit script that requests 2 different GPUs on 2 different nodes.
>>>>> With this simple script:
>>>>>
>>>>> #!/bin/bash
>>>>> #
>>>>> #SBATCH --job-name=N2n4
>>>>> #SBATCH --output=N2n4-CUDA.txt
>>>>> #SBATCH --gres=gpu:GeForceRTX3080:1
>>>>>
>>>>> #SBATCH -N 2 # number of nodes
>>>>> #SBATCH -n 4 # number of tasks (one core each by default)
>>>>> #SBATCH --partition=cuda.q
>>>>>
>>>>> module load cuda/11.2
>>>>>
>>>>> sleep 100
>>>>> mpirun /home/caos/druiz/samples-SLURM/OpenMPI/mpihello-3.0.0
>>>>>
>>>>> I get a parallel MPI job on two nodes, with two processes and one
>>>>> GeForceRTX3080 on each node. However, if I want to request 2 different
>>>>> GPUs, I can't write "#SBATCH --gres=gpu:GeForceRTX3080:1,
>>>>> --gres=gpu:GeForceRTX2070:1" because the "#SBATCH --gres=" line applies
>>>>> to each node, so a line containing two "gres" entries would request a
>>>>> node with 2 different GPUs. So, is it possible to request 2 different
>>>>> GPUs on 2 different nodes?
>>>>>
>>>>> Thanks.
>>>>>
>>>>
>>>>
>>>> --
>>>> Diego Zuccato
>>>> DIFA - Dip. di Fisica e Astronomia
>>>> Servizi Informatici
>>>> Alma Mater Studiorum - Università di Bologna V.le Berti-Pichat 6/2 - 40127
>>>> Bologna - Italy
>>>> tel.: +39 051 20 95786


-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786


