[slurm-users] Is it possible to define multiple partitions for the same node, but each one having a different subset of GPUs?

Cristóbal Navarro cristobal.navarro.g at gmail.com
Wed Mar 31 15:21:06 UTC 2021


Hi Community,
I was checking the documentation but could not find clear information on
what I am trying to do.
Here at the university we have a large compute node with 3 classes of GPUs.
Let's say the node's hostname is "gpuComputer"; it is composed of:

   - 4x large GPUs
   - 4x medium GPUs (MIG devices)
   - 16x small GPUs (MIG devices)

Our plan is to have one partition for each class of GPUs.
So if a user chooses the "small" partition, their job would see only the
16 small GPUs and would not interfere with other jobs running on the
"medium" or "large" partitions.

Can I create three partitions and specify the corresponding subset of GPUs
for each one?

If not, would NodeName and NodeHostname serve as an alternative? That is,
define the node three times with different NodeName values, all using
NodeHostname=gpuComputer, and give each definition the corresponding subset
of "Gres" resources. Each partition would then list only its corresponding
NodeName.
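
To make the second idea concrete, this is roughly what I have in mind. It is
only an untested sketch: the node names, partition names, GRES type names
(large/medium/small) and port numbers are placeholders I made up, and the
matching gres.conf entries are not shown. As far as I understand, running
several NodeName entries on one physical host also requires Slurm to be built
with multiple-slurmd support and a distinct Port per entry.

```
# slurm.conf fragment (hypothetical names and ports; untested sketch)
GresTypes=gpu

# One logical node per GPU class, all mapped to the same physical host.
NodeName=gpu-large  NodeHostname=gpuComputer Port=17001 Gres=gpu:large:4
NodeName=gpu-medium NodeHostname=gpuComputer Port=17002 Gres=gpu:medium:4
NodeName=gpu-small  NodeHostname=gpuComputer Port=17003 Gres=gpu:small:16

# One partition per logical node.
PartitionName=large  Nodes=gpu-large
PartitionName=medium Nodes=gpu-medium
PartitionName=small  Nodes=gpu-small
```

CPU and memory settings for each logical node are left out here; they would
presumably have to be split among the three definitions as well.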

Any feedback or advice on the best way to accomplish this would be much
appreciated.
Best regards,



-- 
Cristóbal A. Navarro