[slurm-users] How to apply for multiple GPU cards from different worker nodes?

Wed Apr 17 00:43:25 UTC 2019

Dear Marcus,

      Thanks a lot for your reply. I will write it into our User Manual,
and let users know how to apply for multiple GPU cards.
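
As a starting point for the manual, here is a minimal sketch of the rule discussed in this thread: since --gres is counted per node, a total GPU count has to be turned into a node count plus a per-node count. The helper name gres_request is hypothetical (it is not part of Slurm), and the 8-GPUs-per-node limit and the v100 type are assumptions taken from our cluster. When the total does not split evenly, the sketch rounds up, so a job may receive more cards than it asked for.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): print the --nodes and --gres
# directives for a total GPU count, given the number of GPUs per node.
# Because --gres is counted per node, the total is spread evenly over the
# smallest number of nodes that can hold it, rounding up if necessary.
gres_request() {
    local total=$1 per_node_max=$2 gpu_type=${3:-v100}
    # Smallest node count that can hold the request (ceiling division).
    local nodes=$(( (total + per_node_max - 1) / per_node_max ))
    # GPUs to request on each node (ceiling division, may over-allocate).
    local per_node=$(( (total + nodes - 1) / nodes ))
    echo "#SBATCH --nodes=${nodes}"
    echo "#SBATCH --gres=gpu:${gpu_type}:${per_node}"
}

gres_request 16 8   # 16 cards: 2 nodes with 8 GPUs each
gres_request 10 8   # 10 cards: 2 nodes with 5 GPUs each
```

The two calls correspond to the cases in this thread: 16 cards become --nodes=2 with --gres=gpu:v100:8, and 10 cards become --nodes=2 with --gres=gpu:v100:5, as Marcus suggested.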

Best regards,
Ran

On Tue, Apr 16, 2019 at 5:40 PM Marcus Wagner <wagner at itc.rwth-aachen.de>
wrote:

> Dear Ran,
>
> you can only ask for GPUs PER NODE, as GRES are resources per node.
>
> So, you can ask for 5 gpus and then get 5 gpus on each of the two nodes.
> At the moment it is not possible to ask for 8 gpus on one node and 2 on
> another.
> That MIGHT change with Slurm 19.05, since SchedMD is overhauling, among
> other things, the GPU handling within Slurm.
>
>
> Best
> Marcus
>
> On 4/16/19 9:15 AM, Ran Du wrote:
>
> Dear Antony,
>
>       It worked!
>
>       I checked the allocation, and here is the record:
>
>       Nodes=gpu012 CPU_IDs=0-2 Mem=3072 GRES_IDX=gpu:v100(IDX:0-7)
>       Nodes=gpu013 CPU_IDs=0 Mem=1024 GRES_IDX=gpu:v100(IDX:0-7)
>
>       The job has got what it applied for.
>
>       Another question: how do I apply for a number of cards that is not
> evenly divisible by 8? For example, to apply for 10 GPU cards, 8 cards on
> one node and 2 cards on another node?
>
>      Thanks a lot again for your kind help.
>
> Best regards,
> Ran
>
>
> On Mon, Apr 15, 2019 at 8:25 PM Ran Du <bella.ran.du at gmail.com> wrote:
>
>> Dear Antony,
>>
>>        Thanks a lot for your reply. I tried to submit a job following your
>> advice, and there were no more sbatch errors.
>>
>>        But because our cluster is under maintenance, I have to wait till
>> tomorrow to see if GPU cards are allocated correctly.  I will let you know
>> as soon as the job is submitted successfully.
>>
>>        Thanks a lot for your kind help.
>>
>> Best regards,
>> Ran
>>
>> On Mon, Apr 15, 2019 at 4:40 PM Antony Cleave <antony.cleave at gmail.com>
>> wrote:
>>
>>> Ask for 8 gpus on 2 nodes instead.
>>>
>>> In your script just change the 16 to 8 and it should do what you want.
>>>
>>> You are currently asking for 2 nodes with 16 GPUs each, as GRES resources
>>> are per node.
>>>
>>> Antony
>>>
>>> On Mon, 15 Apr 2019, 09:08 Ran Du, <bella.ran.du at gmail.com> wrote:
>>>
>>>> Dear all,
>>>>
>>>>      Does anyone know how to set #SBATCH options to get multiple GPU
>>>> cards from different worker nodes?
>>>>
>>>>      One of our users would like to apply for 16 NVIDIA V100 cards for
>>>> his job, and there are 8 GPU cards on each worker node. I have tried the
>>>> following #SBATCH options:
>>>>
>>>>       #SBATCH --partition=gpu
>>>>       #SBATCH --qos=normal
>>>>       #SBATCH --account=u07
>>>>       #SBATCH --job-name=cross
>>>>       #SBATCH --nodes=2
>>>>       #SBATCH --mem-per-cpu=1024
>>>>       #SBATCH --output=test.32^4.16gpu.log
>>>>       #SBATCH --gres=gpu:v100:16
>>>>
>>>>       but got the sbatch error message :
>>>>       sbatch: error: Batch job submission failed: Requested node
>>>> configuration is not available
>>>>
>>>>       And I found a similar question on stack overflow:
>>>>
>>>> https://stackoverflow.com/questions/45200926/how-to-access-to-gpus-on-different-nodes-in-a-cluster-with-slurm
>>>>
>>>>       It says there that allocating GPU cards across different worker
>>>> nodes is not possible. The post is from 2017; is that still true at
>>>> present?
>>>>
>>>>       Thanks a lot for your help.
>>>>
>>>> Best regards,
>>>> Ran
>>>>
>>>
> --
> Marcus Wagner, Dipl.-Inf.
>
> IT Center
> Abteilung: Systeme und Betrieb
> RWTH Aachen University
> Seffenter Weg 23
> 52074 Aachen
> Tel: +49 241 80-24383
> Fax: +49 241 80-624383
> wagner at itc.rwth-aachen.de
> www.itc.rwth-aachen.de
>
>