[slurm-users] schedule mixed nodes first

Durai Arasan arasan.durai at gmail.com
Fri May 14 21:52:47 UTC 2021


Frequently, all of our GPU nodes (8 GPUs each) are in the MIXED state and
no node is IDLE. Some jobs require a complete node (all 8 GPUs), and such
jobs therefore have to wait a very long time before they can run.

Is there a way to improve this situation, e.g. by not blocking IDLE nodes
with jobs that use only a fraction of the 8 GPUs? Why are single-GPU jobs
not scheduled to fill nodes that are already MIXED before using IDLE ones?
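
To make the desired behaviour concrete, here is a toy sketch of the placement
policy I have in mind. It is not Slurm code and says nothing about how Slurm
selects nodes internally; the function names (pick_node, place) and the node
names are hypothetical and used only for illustration. The idea is a simple
"tightest fit": among the nodes that can hold a job, choose the one with the
fewest free GPUs, so partially used nodes fill up before an IDLE node is touched.

```python
# Toy model only: not Slurm code, and not a description of Slurm's internals.
# All names below are hypothetical and exist only for this illustration.


def pick_node(free_gpus, request):
    """Return the node that fits `request` GPUs with the fewest free GPUs.

    free_gpus maps a node name to its current number of free GPUs.
    Choosing the tightest fit fills MIXED nodes before IDLE ones.
    Returns None if no node can currently hold the job.
    """
    candidates = [
        (free, name) for name, free in free_gpus.items() if free >= request
    ]
    if not candidates:
        return None
    # min() compares free-GPU counts first, then node names as a tie-breaker.
    return min(candidates)[1]


def place(free_gpus, request):
    """Pick a node for the job and subtract the allocated GPUs from it."""
    node = pick_node(free_gpus, request)
    if node is not None:
        free_gpus[node] -= request
    return node


if __name__ == "__main__":
    # One MIXED node (3 of 8 GPUs free) and two IDLE nodes (8 of 8 free).
    free = {"node-a": 8, "node-b": 3, "node-c": 8}
    print(place(free, 1))  # single-GPU job
    print(place(free, 8))  # full-node job
    print(free)
```

With this policy the single-GPU job lands on the MIXED node, so both IDLE
nodes remain available for jobs that need all 8 GPUs.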

Which parameters or configuration settings need to be adjusted to enforce this?

Our current scheduling configuration:


gres.conf (one node example):
NodeName=gpu-6 Name=gpu Type=rtx2080ti File=/dev/nvidia[0-3]
NodeName=gpu-6 Name=gpu Type=rtx2080ti File=/dev/nvidia[4-7]

Thank you,
Competence center for Machine Learning Tübingen