[slurm-users] schedule mixed nodes first
arasan.durai at gmail.com
Fri May 14 21:52:47 UTC 2021
Frequently, all of our GPU nodes (8 GPUs each) are in the MIXED state and
no node is IDLE. Some jobs require a complete node (all 8 GPUs), so they
have to wait a very long time before they can run.
Is there a way to improve this situation, e.g. by not blocking IDLE nodes
with jobs that use only a fraction of the 8 GPUs? Why are single-GPU jobs
not scheduled to fill nodes that are already MIXED before IDLE ones are used?
Which parameters or configuration settings need to be adjusted to enforce this?
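To make the behavior I am asking for concrete, here is a minimal sketch in Python. It is not Slurm code and says nothing about how Slurm selects nodes internally; the `Node` class, the `pick_node` function, and the node names gpu-5 and gpu-7 are made up for illustration. It only shows a best-fit rule: among the nodes that can still hold the job, prefer the one with the fewest free GPUs, so that MIXED nodes are filled before an IDLE node is touched.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical model of a GPU node; not a Slurm data structure.
    name: str
    total_gpus: int
    used_gpus: int = 0

    @property
    def free_gpus(self) -> int:
        return self.total_gpus - self.used_gpus


def pick_node(nodes, gpus_needed):
    """Return the node with the fewest free GPUs that still fits the job."""
    candidates = [n for n in nodes if n.free_gpus >= gpus_needed]
    if not candidates:
        return None  # the job has to wait
    # Best fit: fewest free GPUs first; the name is only a tie-breaker.
    return min(candidates, key=lambda n: (n.free_gpus, n.name))


# Example cluster state: one IDLE node and two MIXED nodes (assumed values).
nodes = [
    Node("gpu-5", total_gpus=8, used_gpus=0),  # IDLE
    Node("gpu-6", total_gpus=8, used_gpus=3),  # MIXED, 5 GPUs free
    Node("gpu-7", total_gpus=8, used_gpus=7),  # MIXED, 1 GPU free
]

single_gpu_choice = pick_node(nodes, 1)  # lands on a MIXED node
full_node_choice = pick_node(nodes, 8)   # only the IDLE node fits
```

With a rule like this, a single-GPU job goes to the most heavily used node that still has room, and the IDLE node stays available for a job that needs all 8 GPUs.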
Our current scheduling configuration:
gres.conf (one node example):
NodeName=gpu-6 Name=gpu Type=rtx2080ti File=/dev/nvidia[0-3]
NodeName=gpu-6 Name=gpu Type=rtx2080ti File=/dev/nvidia[4-7]
Competence center for Machine Learning Tübingen