[slurm-users] GPU jobs not running correctly

Andrey Malyutin malyutinag at gmail.com
Fri Aug 20 00:35:29 UTC 2021


We are finishing the setup of a cluster with 3 nodes, 4 GPUs each. One
node has RTX3090s and the other 2 have RTX6000s. Any job asking for 1 GPU
in the submission script will wait to run on the 3090 node, regardless of
resource availability. The same job requesting 2 or more GPUs will run on
any node. I don't even know where to begin troubleshooting this issue; the
entries for the 3 nodes are effectively identical in slurm.conf. Any help
would be appreciated. (In case it helps: this cluster is used for
structural biology, with the cryoSPARC and RELION packages.)
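
For readers unfamiliar with this kind of setup, below is a minimal sketch of
what effectively identical GPU node entries in slurm.conf can look like. It
is an assumption for illustration only, not the actual configuration from
this cluster: the node names, the partition name, and the untyped Gres=gpu:4
lines are hypothetical, and CPU and memory fields are left out.

```
# Hypothetical slurm.conf excerpt (placeholder names, not the real file)
GresTypes=gpu

# Three nodes, each advertising 4 GPUs with no GPU type given
NodeName=node01 Gres=gpu:4 State=UNKNOWN
NodeName=node02 Gres=gpu:4 State=UNKNOWN
NodeName=node03 Gres=gpu:4 State=UNKNOWN

PartitionName=gpu Nodes=node[01-03] Default=YES State=UP

# A single-GPU job would typically request its GPU in the submission
# script with a line such as:  #SBATCH --gres=gpu:1
```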

Thank you,
