[slurm-users] Need help with running multiple instances/executions of a batch script in parallel (with NVIDIA HGX A100 GPU as a Gres)
Diego Zuccato
diego.zuccato at unibo.it
Tue Jan 23 08:59:48 UTC 2024
Also, remember to specify the memory used by the job if you treat memory
as a TRES, i.e. if you're using CR_*Memory to select resources.
Diego
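
To illustrate the memory point: a minimal sketch, assuming the gpu-job.sh script quoted below and a hypothetical 16G memory request (pick whatever your job really needs). As far as I understand, with a CR_*Memory setting a job that requests no memory can be given the whole node's memory when no default is configured, which would block other jobs even if GPUs are free. Options given on the sbatch command line take precedence over the #SBATCH lines in the script.

```shell
#!/bin/bash
# Hypothetical values, for illustration only
GPUS=1      # GPUs requested per node
MEM=16G     # memory per node (assumption, not from the thread)

# Options passed on the command line override #SBATCH directives in the script
args=(--partition=gpu --nodes=1 --gpus-per-node="$GPUS" --mem="$MEM")

# Show the command that would be run
echo "sbatch ${args[*]} gpu-job.sh"

# Submit only where Slurm is installed and the script exists
if command -v sbatch >/dev/null 2>&1 && [ -f gpu-job.sh ]; then
    sbatch "${args[@]}" gpu-job.sh
fi
```

The same request can be written inside the script as a "#SBATCH --mem=16G" line.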
On 18/01/2024 15:44, Ümit Seren wrote:
> This line also has to be changed:
>
>
> #SBATCH --gpus-per-node=4
>
> to:
>
> #SBATCH --gpus-per-node=1
>
> --gpus-per-node seems to be the new parameter that is replacing the
> --gres= one, so you can remove the --gres line completely.
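
Putting that suggestion together, a minimal sketch of the resulting batch script, assuming the same partition name and job body as the gpu-job.sh quoted further down. The helper name make_gpu_job is made up for this example; it prints a script that requests N GPUs through --gpus-per-node only, with no --gres line.

```shell
#!/bin/bash
# Print a batch script that asks for N GPUs on one node (default: 1).
# Only --gpus-per-node is used for the GPU request.
make_gpu_job() {
    local gpus="${1:-1}"
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=gpu-job
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --gpus-per-node=${gpus}
#SBATCH --tasks-per-node=1
#SBATCH --output=gpu_job_output.%j
#SBATCH --error=gpu_job_error.%j
hostname
date
sleep 40
pwd
EOF
}

# sbatch reads the script from standard input when no file is given.
# Where Slurm is not installed, just print the script instead.
if command -v sbatch >/dev/null 2>&1; then
    make_gpu_job 1 | sbatch
else
    make_gpu_job 1
fi
```

Calling make_gpu_job 2 instead would produce a script asking for 2 GPUs per job.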
>
> Best
>
> Ümit
>
> *From: *slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of
> Kherfani, Hafedh (Professional Services, TC) <hafedh.kherfani at hpe.com>
> *Date: *Thursday, 18. January 2024 at 15:40
> *To: *Slurm User Community List <slurm-users at lists.schedmd.com>
> *Subject: *Re: [slurm-users] Need help with running multiple
> instances/executions of a batch script in parallel (with NVIDIA HGX A100
> GPU as a Gres)
>
> Hi Noam and Matthias,
>
> Thanks both for your answers.
>
> I replaced the "#SBATCH --gres=gpu:4" directive (in the batch script)
> with "#SBATCH --gres=gpu:1" as you suggested, but it didn't make a
> difference: after submitting this batch script 3 times, the first job
> is running, while the second and third jobs are
> still pending …
>
> [slurmtest at c-a100-master test-batch-scripts]$ cat gpu-job.sh
>
> #!/bin/bash
>
> #SBATCH --job-name=gpu-job
>
> #SBATCH --partition=gpu
>
> #SBATCH --nodes=1
>
> #SBATCH --gpus-per-node=4
>
> #SBATCH --gres=gpu:1 # <<<< Changed from '4' to '1'
>
> #SBATCH --tasks-per-node=1
>
> #SBATCH --output=gpu_job_output.%j
>
> #SBATCH --error=gpu_job_error.%j
>
> hostname
>
> date
>
> sleep 40
>
> pwd
>
> [slurmtest at c-a100-master test-batch-scripts]$ sbatch gpu-job.sh
>
> Submitted batch job *217*
>
> [slurmtest at c-a100-master test-batch-scripts]$ squeue
>
> JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
>   217       gpu  gpu-job slurmtes  R  0:02     1 c-a100-cn01
>
> [slurmtest at c-a100-master test-batch-scripts]$ sbatch gpu-job.sh
>
> Submitted batch job *218*
>
> [slurmtest at c-a100-master test-batch-scripts]$ sbatch gpu-job.sh
>
> Submitted batch job *219*
>
> [slurmtest at c-a100-master test-batch-scripts]$ squeue
>
> JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
>   219       gpu  gpu-job slurmtes PD  0:00     1 (Priority)
>   218       gpu  gpu-job slurmtes PD  0:00     1 (Resources)
>   217       gpu  gpu-job slurmtes  R  0:07     1 c-a100-cn01
>
> Basically, I'm looking for some help/hints on how to tell Slurm, from the
> batch script for example, "I want only 1 or 2 GPUs to be used/consumed
> by the job", then submit the batch script a couple of times with
> the sbatch command and confirm that we can indeed have multiple jobs,
> each using a GPU, running in parallel at the same time.
>
> Does that make sense?
>
> Best regards,
>
> *Hafedh *
>
> *From:*slurm-users <slurm-users-bounces at lists.schedmd.com> *On Behalf Of
> *Bernstein, Noam CIV USN NRL (6393) Washington DC (USA)
> *Sent:* Thursday, 18 January 2024 2:30 PM
> *To:* Slurm User Community List <slurm-users at lists.schedmd.com>
> *Subject:* Re: [slurm-users] Need help with running multiple
> instances/executions of a batch script in parallel (with NVIDIA HGX A100
> GPU as a Gres)
>
> On Jan 18, 2024, at 7:31 AM, Matthias Loose <m.loose at mindcode.de> wrote:
>
> Hi Hafedh,
>
> I'm no expert in the GPU side of SLURM, but looking at your current
> configuration, it seems to me it's working as intended at the moment. You
> have defined 4 GPUs and start multiple jobs, each consuming 4 GPUs.
> So the jobs wait for the resource to be free again.
>
> I think what you need to look into is the MPS plugin, which seems to
> do what you are trying to achieve:
> https://slurm.schedmd.com/gres.html#MPS_Management
>
> I agree with the first paragraph. How many GPUs are you expecting each
> job to use? I'd have assumed, based on the original text, that each job
> is supposed to use 1 GPU, and the 4 jobs were supposed to be running
> side-by-side on the one node you have (with 4 GPUs). If so, you need to
> tell each job to request only 1 GPU, and currently each one is requesting 4.
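
To check the side-by-side case described above, a rough sketch, assuming gpu-job.sh has already been changed to request a single GPU. The helper name submit_many is made up for this example; the squeue format string uses %b to show the per-node generic resources each job asked for, next to its state and reason/node.

```shell
#!/bin/bash
# Submit the same batch script COUNT times (default: 4, one per GPU on the node)
submit_many() {
    local script="$1" count="${2:-4}" i
    for ((i = 1; i <= count; i++)); do
        sbatch "$script"
    done
}

# Run only where Slurm is installed and the script exists
if command -v sbatch >/dev/null 2>&1 && [ -f gpu-job.sh ]; then
    submit_many gpu-job.sh 4
    # Job id, partition, state, requested GRES per node, reason or node list
    squeue -u "$USER" -o "%.8i %.9P %.8T %.14b %R"
fi
```

If each job really requests 1 GPU (and memory is not the limiting resource), the expectation is that all four show up as running on the same node rather than pending on Resources.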
>
> If your jobs are actually supposed to be using 4 GPUs each, I still
> don't see any advantage to MPS (at least in what is my usual GPU usage
> pattern): all the jobs will take longer to finish, because they are
> sharing the fixed resource. If they take turns, at least the first ones
> finish as fast as they can, and the last one will finish no later than
> it would have if they were all time-sharing the GPUs. I guess NVIDIA
> had something in mind when they developed MPS, so I guess our pattern
> may not be typical (or at least not universal), and in that case the MPS
> plugin may well be what you need.
>
--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786