Send slurm-users mailing list submissions to
slurm-users@lists.schedmd.com
To subscribe or unsubscribe via the World Wide Web, visit
https://lists.schedmd.com/cgi-bin/mailman/listinfo/slurm-users
or, via email, send a message with subject or body 'help' to
slurm-users-request@lists.schedmd.com
You can reach the person managing the list at
slurm-users-owner@lists.schedmd.com
When replying, please edit your Subject line so it is more specific
than "Re: Contents of slurm-users digest..."
Today's Topics:
1. Re: Need help with running multiple instances/executions of a
batch script in parallel (with NVIDIA HGX A100 GPU as a Gres)
(Marko Markoc)
2. Re: Need help with running multiple instances/executions of a
batch script in parallel (with NVIDIA HGX A100 GPU as a Gres)
(Ümit Seren)
----------------------------------------------------------------------
Message: 1
Date: Fri, 19 Jan 2024 06:12:24 -0800
From: Marko Markoc <mmarkoc@pdx.edu>
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Need help with running multiple
instances/executions of a batch script in parallel (with NVIDIA HGX
A100 GPU as a Gres)
Message-ID:
<CABnuMe4JTA0e6=VbO8D+To=8FGO+3Byv1dK_MC+OuRitzN5dXg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
+1 on checking the memory allocation.
Or add/check if you have any DefMemPerX set in your slurm.conf
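For example, something along these lines (the values are purely illustrative;
adjust them to your nodes):

    # slurm.conf
    DefMemPerGPU=16384        # default MB of memory per allocated GPU
    # or:
    # DefMemPerCPU=4096       # default MB of memory per allocated CPU

    # and/or cap the memory per job in the batch script:
    #SBATCH --mem=16G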
On Fri, Jan 19, 2024 at 12:33 AM mohammed shambakey <shambakey1@gmail.com>
wrote:
> Hi
>
> I'm not an expert, but is it possible that the currently running job is
> consuming the whole node because it has been allocated the whole memory of
> the node (so the other 2 jobs have to wait until it finishes)?
> Maybe try restricting the required memory for each job?
>
> Regards
>
> On Thu, Jan 18, 2024 at 4:46 PM Ümit Seren <uemit.seren@gmail.com> wrote:
>
>> This line also has to be changed:
>>
>>
>> #SBATCH --gpus-per-node=4 → #SBATCH --gpus-per-node=1
>>
>> --gpus-per-node seems to be the new parameter that is replacing the --gres=
>> one, so you can remove the --gres line completely.
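A minimal sketch of the relevant directives after that change (assuming each
job should use a single GPU) would be:

    #SBATCH --partition=gpu
    #SBATCH --nodes=1
    #SBATCH --gpus-per-node=1    # one of the node's four A100s per job
    # (no --gres=gpu:... line)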
>>
>>
>>
>> Best
>>
>> Ümit
>>
>>
>>
>> *From: *slurm-users <slurm-users-bounces@lists.schedmd.com> on behalf of
>> Kherfani, Hafedh (Professional Services, TC) <hafedh.kherfani@hpe.com>
>> *Date: *Thursday, 18. January 2024 at 15:40
>> *To: *Slurm User Community List <slurm-users@lists.schedmd.com>
>> *Subject: *Re: [slurm-users] Need help with running multiple
>> instances/executions of a batch script in parallel (with NVIDIA HGX A100
>> GPU as a Gres)
>>
>> Hi Noam and Matthias,
>>
>>
>>
>> Thanks both for your answers.
>>
>>
>>
>> I changed the "#SBATCH --gres=gpu:4" directive (in the batch script) to
>> "#SBATCH --gres=gpu:1" as you suggested, but it didn't make a difference:
>> running this batch script 3 times results in the first job being in a
>> running state, while the second and third jobs are still in a pending
>> state:
>>
>>
>>
>> [slurmtest@c-a100-master test-batch-scripts]$ cat gpu-job.sh
>>
>> #!/bin/bash
>>
>> #SBATCH --job-name=gpu-job
>>
>> #SBATCH --partition=gpu
>>
>> #SBATCH --nodes=1
>>
>> #SBATCH --gpus-per-node=4
>>
>> #SBATCH --gres=gpu:1              # <<<< Changed from "4" to "1"
>>
>> #SBATCH --tasks-per-node=1
>>
>> #SBATCH --output=gpu_job_output.%j
>>
>> #SBATCH --error=gpu_job_error.%j
>>
>>
>>
>> hostname
>>
>> date
>>
>> sleep 40
>>
>> pwd
>>
>>
>>
>> [slurmtest@c-a100-master test-batch-scripts]$ sbatch gpu-job.sh
>>
>> Submitted batch job *217*
>>
>> [slurmtest@c-a100-master test-batch-scripts]$ squeue
>>
>> JOBID PARTITION NAME USER ST TIME NODES
>> NODELIST(REASON)
>>
>> 217 gpu gpu-job slurmtes R 0:02 1
>> c-a100-cn01
>>
>> [slurmtest@c-a100-master test-batch-scripts]$ sbatch gpu-job.sh
>>
>> Submitted batch job *218*
>>
>> [slurmtest@c-a100-master test-batch-scripts]$ sbatch gpu-job.sh
>>
>> Submitted batch job *219*
>>
>> [slurmtest@c-a100-master test-batch-scripts]$ squeue
>>
>> JOBID PARTITION NAME USER ST TIME NODES
>> NODELIST(REASON)
>>
>> 219 gpu gpu-job slurmtes *PD* 0:00 1
>> (Priority)
>>
>> 218 gpu gpu-job slurmtes *PD* 0:00 1
>> (Resources)
>>
>> 217 gpu gpu-job slurmtes *R* 0:07 1
>> c-a100-cn01
>>
>>
>>
>> Basically I'm looking for some help/hints on how to tell Slurm, from the
>> batch script for example: "I want only 1 or 2 GPUs to be used/consumed by
>> the job", so that I can then run the batch script/job a couple of times with
>> the sbatch command and confirm that we can indeed have multiple jobs, each
>> using a GPU, running in parallel at the same time.
>>
>>
>>
>> Makes sense?
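One way to test that (a sketch, assuming the script requests a single GPU,
has no conflicting --gres line, and the node has enough free CPUs and memory
for each job):

    sbatch --gpus-per-node=1 gpu-job.sh   # command-line options override #SBATCH directives
    sbatch --gpus-per-node=1 gpu-job.sh
    sbatch --gpus-per-node=1 gpu-job.sh
    squeue   # with 4 GPUs on the node, up to 4 such jobs should reach state R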
>>
>>
>>
>>
>>
>> Best regards,
>>
>>
>>
>> *Hafedh *
>>
>>
>>
>> *From:* slurm-users <slurm-users-bounces@lists.schedmd.com> *On Behalf
>> Of *Bernstein, Noam CIV USN NRL (6393) Washington DC (USA)
*Sent:* Thursday, January 18, 2024 2:30 PM
>> *To:* Slurm User Community List <slurm-users@lists.schedmd.com>
>> *Subject:* Re: [slurm-users] Need help with running multiple
>> instances/executions of a batch script in parallel (with NVIDIA HGX A100
>> GPU as a Gres)
>>
>>
>>
>> On Jan 18, 2024, at 7:31 AM, Matthias Loose <m.loose@mindcode.de> wrote:
>>
>>
>>
>> Hi Hafedh,
>>
>> I'm no expert on the GPU side of Slurm, but looking at your current
>> configuration, to me it's working as intended at the moment. You have
>> defined 4 GPUs and start multiple jobs, each consuming 4 GPUs. So the jobs
>> wait for the resource to be free again.
>>
>> I think what you need to look into is the MPS plugin, which seems to do
>> what you are trying to achieve:
>>
https://slurm.schedmd.com/gres.html#MPS_Management
>>
>>
>>
>> I agree with the first paragraph. How many GPUs are you expecting each
>> job to use? I'd have assumed, based on the original text, that each job is
>> supposed to use 1 GPU, and the 4 jobs were supposed to be running
>> side-by-side on the one node you have (with 4 GPUs). If so, you need to
>> tell each job to request only 1 GPU, and currently each one is requesting 4.
>>
>>
>>
>> If your jobs are actually supposed to be using 4 GPUs each, I still don't
>> see any advantage to MPS (at least in what is my usual GPU usage pattern):
>> all the jobs will take longer to finish, because they are sharing the fixed
>> resource. If they take turns, at least the first ones finish as fast as
>> they can, and the last one will finish no later than it would have if they
>> were all time-sharing the GPUs. I guess NVIDIA had something in mind when
>> they developed MPS, so I guess our pattern may not be typical (or at least
>> not universal), and in that case the MPS plugin may well be what you need.
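For anyone who does want to try MPS sharing, a rough configuration sketch
based on the linked gres.html page (names, paths and counts are illustrative
only; check the documentation for your Slurm version):

    # gres.conf on the GPU node
    Name=gpu Type=a100 File=/dev/nvidia[0-3]
    Name=mps Count=400            # MPS "shares" spread across the 4 GPUs

    # slurm.conf
    GresTypes=gpu,mps
    NodeName=c-a100-cn01 Gres=gpu:4,mps:400 ...

    # a job then requests a fraction of a GPU, e.g.:
    #SBATCH --gres=mps:100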
>>
>
>
> --
> Mohammed
>
------------------------------
Message: 2
Date: Fri, 19 Jan 2024 15:24:17 +0100
From: Ümit Seren <uemit.seren@gmail.com>
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Need help with running multiple
instances/executions of a batch script in parallel (with NVIDIA HGX
A100 GPU as a Gres)
Message-ID:
<CANBYW4ACFtNwawVc8WqcGXgBOAq6v_eeHHX9mXGdgbUs_D=EyA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Maybe also post the output of scontrol show job <jobid> to check the other
resources allocated for the job.
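For example (a sketch; the exact field names can vary a bit between Slurm
versions):

    scontrol show job 217 | grep -iE 'tres'
    # expect something like:
    #   TRES=cpu=1,mem=...,node=1,billing=1,gres/gpu=1
    #   TresPerNode=gres/gpu:1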
On Thu, Jan 18, 2024, 19:22 Kherfani, Hafedh (Professional Services, TC) <
hafedh.kherfani@hpe.com> wrote:
> Hi Ümit, Troy,
>
>
>
> I removed the line "#SBATCH --gres=gpu:1" and changed the sbatch directive
> "--gpus-per-node=4" to "--gpus-per-node=1", but I'm still getting the same
> result: when running multiple sbatch commands for the same script, only one
> job (the first execution) is running, and all subsequent jobs are in a
> pending state (with REASON reported as "Resources" for the immediately next
> job in the queue and "Priority" for the remaining ones).
>
>
>
> As for the output from the "scontrol show job <jobid>" command: I don't see
> a "TRES" field on its own. I do see the field "TresPerNode=gres/gpu:1" (the
> value at the end of that line corresponds to the value specified in the
> "--gpus-per-node=" directive).
>
>
>
> PS: Is it normal/expected (in the output of the scontrol show job command)
> to have "Features=(null)"? I was expecting to see Features=gpu.
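(As far as I know, the Features field in "scontrol show job" reflects what
the job requested with --constraint, not its GRES, so Features=(null) is
expected here. A feature would only show up if the nodes define one in
slurm.conf and the job asks for it; the feature name below is made up:

    # slurm.conf
    NodeName=c-a100-cn01 Gres=gpu:4 Feature=a100 ...

    # batch script
    #SBATCH --constraint=a100
)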
>
>
>
>
>
> Best regards,
>
>
>
> *Hafedh *
>
>
>
> *From:* slurm-users <slurm-users-bounces@lists.schedmd.com> *On Behalf Of
> *Baer, Troy
> *Sent:* Thursday, January 18, 2024 3:47 PM
> *To:* Slurm User Community List <slurm-users@lists.schedmd.com>
> *Subject:* Re: [slurm-users] Need help with running multiple
> instances/executions of a batch script in parallel (with NVIDIA HGX A100
> GPU as a Gres)
>
>
>
> Hi Hafedh,
>
>
>
> Your job script has the sbatch directive "--gpus-per-node=4" set. I
> suspect that if you look at what's allocated to the running job by doing
> "scontrol show job <jobid>" and looking at the TRES field, it's been
> allocated 4 GPUs instead of one.
>
>
>
> Regards,
>
> --Troy
>
>
>
> *From:* slurm-users <slurm-users-bounces@lists.schedmd.com> *On Behalf Of
> *Kherfani, Hafedh (Professional Services, TC)
> *Sent:* Thursday, January 18, 2024 9:38 AM
> *To:* Slurm User Community List <slurm-users@lists.schedmd.com>
> *Subject:* Re: [slurm-users] Need help with running multiple
> instances/executions of a batch script in parallel (with NVIDIA HGX A100
> GPU as a Gres)
>
>
>
>
>
>
> Hi Noam and Matthias,
>
>
>
> Thanks both for your answers.
>
>
>
> I changed the "#SBATCH --gres=gpu:4" directive (in the batch script) to
> "#SBATCH --gres=gpu:1" as you suggested, but it didn't make a difference:
> running this batch script 3 times results in the first job being in a
> running state, while the second and third jobs are still in a pending
> state:
>
>
>
> [slurmtest@c-a100-master test-batch-scripts]$ cat gpu-job.sh
>
> #!/bin/bash
>
> #SBATCH --job-name=gpu-job
>
> #SBATCH --partition=gpu
>
> #SBATCH --nodes=1
>
> #SBATCH --gpus-per-node=4
>
> #SBATCH --gres=gpu:1              # <<<< Changed from "4" to "1"
>
> #SBATCH --tasks-per-node=1
>
> #SBATCH --output=gpu_job_output.%j
>
> #SBATCH --error=gpu_job_error.%j
>
>
>
> hostname
>
> date
>
> sleep 40
>
> pwd
>
>
>
> [slurmtest@c-a100-master test-batch-scripts]$ sbatch gpu-job.sh
>
> Submitted batch job *217*
>
> [slurmtest@c-a100-master test-batch-scripts]$ squeue
>
> JOBID PARTITION NAME USER ST TIME NODES
> NODELIST(REASON)
>
> 217 gpu gpu-job slurmtes R 0:02 1
> c-a100-cn01
>
> [slurmtest@c-a100-master test-batch-scripts]$ sbatch gpu-job.sh
>
> Submitted batch job *218*
>
> [slurmtest@c-a100-master test-batch-scripts]$ sbatch gpu-job.sh
>
> Submitted batch job *219*
>
> [slurmtest@c-a100-master test-batch-scripts]$ squeue
>
> JOBID PARTITION NAME USER ST TIME NODES
> NODELIST(REASON)
>
> 219 gpu gpu-job slurmtes *PD* 0:00 1
> (Priority)
>
> 218 gpu gpu-job slurmtes *PD* 0:00 1
> (Resources)
>
> 217 gpu gpu-job slurmtes *R* 0:07 1
> c-a100-cn01
>
>
>
> Basically I'm looking for some help/hints on how to tell Slurm, from the
> batch script for example: "I want only 1 or 2 GPUs to be used/consumed by
> the job", so that I can then run the batch script/job a couple of times with
> the sbatch command and confirm that we can indeed have multiple jobs, each
> using a GPU, running in parallel at the same time.
>
>
>
> Makes sense?
>
>
>
>
>
> Best regards,
>
>
>
> *Hafedh *
>
>
>
> *From:* slurm-users <slurm-users-bounces@lists.schedmd.com> *On Behalf Of
> *Bernstein, Noam CIV USN NRL (6393) Washington DC (USA)
> *Sent:* Thursday, January 18, 2024 2:30 PM
> *To:* Slurm User Community List <slurm-users@lists.schedmd.com>
> *Subject:* Re: [slurm-users] Need help with running multiple
> instances/executions of a batch script in parallel (with NVIDIA HGX A100
> GPU as a Gres)
>
>
>
> On Jan 18, 2024, at 7:31 AM, Matthias Loose <m.loose@mindcode.de> wrote:
>
>
>
> Hi Hafedh,
>
> I'm no expert on the GPU side of Slurm, but looking at your current
> configuration, to me it's working as intended at the moment. You have
> defined 4 GPUs and start multiple jobs, each consuming 4 GPUs. So the jobs
> wait for the resource to be free again.
>
> I think what you need to look into is the MPS plugin, which seems to do
> what you are trying to achieve:
>
https://slurm.schedmd.com/gres.html#MPS_Management
>
>
>
> I agree with the first paragraph. How many GPUs are you expecting each
> job to use? I'd have assumed, based on the original text, that each job is
> supposed to use 1 GPU, and the 4 jobs were supposed to be running
> side-by-side on the one node you have (with 4 GPUs). If so, you need to
> tell each job to request only 1 GPU, and currently each one is requesting 4.
>
>
>
> If your jobs are actually supposed to be using 4 GPUs each, I still don't
> see any advantage to MPS (at least in what is my usual GPU usage pattern):
> all the jobs will take longer to finish, because they are sharing the fixed
> resource. If they take turns, at least the first ones finish as fast as
> they can, and the last one will finish no later than it would have if they
> were all time-sharing the GPUs. I guess NVIDIA had something in mind when
> they developed MPS, so I guess our pattern may not be typical (or at least
> not universal), and in that case the MPS plugin may well be what you need.
>
End of slurm-users Digest, Vol 75, Issue 31
*******************************************