[slurm-users] Usage of particular GPU out of 4 GPUs while submitting jobs to DGX Server

Ravi Konila ravibhatk at gmail.com
Mon Nov 20 04:36:42 UTC 2023


Hello Everyone

I am a beginner with Slurm and have started using it on our DGX server, which has four A100 80GB GPUs.
Everything works fine; jobs go to whichever GPU happens to be free.
My question is about submitting jobs to specific GPUs. How does a student submit a job to a particular GPU out of the four? For example, studentA should submit the job to GPU ID 1 instead of GPU ID 0.

We are also planning to enable MIG on the server, and we would like a few students to submit jobs to a 20G partition and non-critical jobs to a 5G partition.
What should slurm.conf and gres.conf look like in this case?

Our current configuration is as follows:

gres.conf
Name=gpu    type=A100    file=/dev/nvidia[0-2,4]

------------
slurm.conf
.
.
.
GresTypes=gpu
NodeName=rl-dgxs-r21-l2 Gres=gpu:A100:4 CPUs=128 RealMemory=500000 State=UNKNOWN
PartitionName=LocalGPUQ Nodes=ALL Default=YES MaxTime=INFINITE State=UP

-------------

Any suggestions or help in this regard would be highly appreciated.

With Warm Regards
Ravi Konila