[slurm-users] How to apply for multiple GPU cards from different worker nodes?
bella.ran.du at gmail.com
Mon Apr 15 12:25:34 UTC 2019
Thanks a lot for your reply. I tried to submit a job following your
advice, and there were no more sbatch errors.
But because our cluster is under maintenance, I have to wait until
tomorrow to see whether the GPU cards are allocated correctly. I will let
you know as soon as the job runs successfully.
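
For reference, one way to check the allocation once the cluster is back (a sketch assuming standard Slurm tooling; `<jobid>` is a placeholder for the real job ID):

```shell
# Show the GRES actually allocated to a running job:
scontrol show job <jobid> | grep -i gres

# Or, from inside the batch script, print the GPUs visible on each node:
srun --ntasks-per-node=1 bash -c \
    'echo "$(hostname): CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"'
```

If the request worked, each of the two nodes should report 8 GPU device indices.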
Thanks a lot for your kind help.
On Mon, Apr 15, 2019 at 4:40 PM Antony Cleave <antony.cleave at gmail.com> wrote:
> Ask for 8 gpus on 2 nodes instead.
> In your script just change the 16 to 8 and it should do what you want.
> You are currently asking for 2 nodes with 16 GPUs each, because GRES
> resources are requested per node.
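
To illustrate the advice above, a minimal sketch of the corrected request (the other directives from the original script below would stay the same):

```shell
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --nodes=2
#SBATCH --gres=gpu:v100:8   # GRES is per node: 2 nodes x 8 GPUs = 16 GPUs total
```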
> On Mon, 15 Apr 2019, 09:08 Ran Du, <bella.ran.du at gmail.com> wrote:
>> Dear all,
>> Does anyone know how to set #SBATCH options to get multiple GPU
>> cards from different worker nodes?
>> One of our users would like to request 16 NVIDIA V100 cards for
>> his job, but there are only 8 GPU cards on each worker node. I have tried
>> the following #SBATCH options:
>> #SBATCH --partition=gpu
>> #SBATCH --qos=normal
>> #SBATCH --account=u07
>> #SBATCH --job-name=cross
>> #SBATCH --nodes=2
>> #SBATCH --mem-per-cpu=1024
>> #SBATCH --output=test.32^4.16gpu.log
>> #SBATCH --gres=gpu:v100:16
>> but got the sbatch error message :
>> sbatch: error: Batch job submission failed: Requested node
>> configuration is not available
>> I also found a similar question on Stack Overflow, which said that
>> allocating multiple GPU cards across different worker nodes was not
>> possible. That post is from 2017; is it still true now?
>> Thanks a lot for your help.
>> Best regards,