[slurm-users] How to apply for multiple GPU cards from different worker nodes?
Marcus Wagner
wagner at itc.rwth-aachen.de
Tue Apr 16 09:38:38 UTC 2019
Dear Ran,
you can only ask for GPUs PER NODE, as GRES are resources per node.
So you can ask for 5 GPUs and then get 5 GPUs on each of the two nodes.
At the moment it is not possible to ask for 8 GPUs on one node and 2 on
another.
That MIGHT change with Slurm 19.05, since SchedMD is overhauling, among
other things, the GPU handling within Slurm.
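
To make the per-node arithmetic concrete, here is a rough sketch in plain
shell. The helper name gres_per_node is made up for illustration and is not
a Slurm command; the limit of 8 GPUs per node and the gpu:v100 GRES name are
taken from this thread. It only computes the value to pass to --gres and
does not talk to Slurm at all.

```shell
#!/bin/sh
# Sketch only: gres_per_node is a hypothetical helper, not a Slurm command.
# It turns a total GPU count into the per-node value that --gres expects.
# Arguments: total GPUs wanted, number of nodes, GPUs installed per node.
gres_per_node() {
    total=$1
    nodes=$2
    per_node_max=$3

    # GRES counts are per node, so the total has to split evenly.
    if [ $((total % nodes)) -ne 0 ]; then
        echo "cannot split $total GPUs evenly over $nodes nodes" >&2
        return 1
    fi

    per_node=$((total / nodes))

    # Asking for more GPUs per node than a node has is presumably what
    # led to "Requested node configuration is not available" earlier
    # in this thread (16 requested per node, 8 installed).
    if [ "$per_node" -gt "$per_node_max" ]; then
        echo "$per_node GPUs per node exceeds the $per_node_max installed" >&2
        return 1
    fi

    # gpu:v100 is the GRES name and type used in this thread.
    echo "--gres=gpu:v100:$per_node"
}

gres_per_node 16 2 8   # prints --gres=gpu:v100:8
gres_per_node 10 2 8   # prints --gres=gpu:v100:5
```

The printed value would go into the job script next to --nodes=2, so 10
cards end up as 5 on each of the two nodes, not as 8 plus 2.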
Best
Marcus
On 4/16/19 9:15 AM, Ran Du wrote:
> Dear Antony,
>
> It worked!
>
> I checked the allocation, and here is the record:
>
> Nodes=gpu012 CPU_IDs=0-2 Mem=3072 GRES_IDX=gpu:v100(IDX:0-7)
> Nodes=gpu013 CPU_IDs=0 Mem=1024 GRES_IDX=gpu:v100(IDX:0-7)
>
> The job has got what it applied for.
>
> And another question: how do I apply for a number of cards that is
> not evenly divisible by 8? For example, to apply for 10 GPU cards, 8
> cards on one node and 2 cards on another node?
>
> Thanks a lot again for your kind help.
>
> Best regards,
> Ran
>
> On Mon, Apr 15, 2019 at 8:25 PM Ran Du <bella.ran.du at gmail.com> wrote:
>
> Dear Antony,
>
> Thanks a lot for your reply. I submitted a job following
> your advice, and there were no more sbatch errors.
>
> But because our cluster is under maintenance, I have to
> wait till tomorrow to see if GPU cards are allocated correctly. I
> will let you know as soon as the job is submitted successfully.
>
> Thanks a lot for your kind help.
>
> Best regards,
> Ran
>
> On Mon, Apr 15, 2019 at 4:40 PM Antony Cleave
> <antony.cleave at gmail.com> wrote:
>
> Ask for 8 gpus on 2 nodes instead.
>
> In your script just change the 16 to 8 and it should do
> what you want.
>
> You are currently asking for 2 nodes with 16 GPUs each, as GRES
> resources are per node.
>
> Antony
>
> On Mon, 15 Apr 2019, 09:08 Ran Du, <bella.ran.du at gmail.com> wrote:
>
> Dear all,
>
> Does anyone know how to set #SBATCH options to get
> multiple GPU cards from different worker nodes?
>
> One of our users would like to apply for 16 NVIDIA
> V100 cards for his job, and there are 8 GPU cards on each
> worker node. I have tried the following #SBATCH options:
>
> #SBATCH --partition=gpu
> #SBATCH --qos=normal
> #SBATCH --account=u07
> #SBATCH --job-name=cross
> #SBATCH --nodes=2
> #SBATCH --mem-per-cpu=1024
> #SBATCH --output=test.32^4.16gpu.log
> #SBATCH --gres=gpu:v100:16
>
> but got the sbatch error message :
> sbatch: error: Batch job submission failed:
> Requested node configuration is not available
>
> And I found a similar question on stack overflow:
> https://stackoverflow.com/questions/45200926/how-to-access-to-gpus-on-different-nodes-in-a-cluster-with-slurm
>
> It says that allocating multiple GPU cards across
> different worker nodes is not possible. That post is from
> 2017; is it still true at present?
>
> Thanks a lot for your help.
>
> Best regards,
> Ran
>
--
Marcus Wagner, Dipl.-Inf.
IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner at itc.rwth-aachen.de
www.itc.rwth-aachen.de