[slurm-users] Job allocation from a heterogenous pool of nodes

Le, Viet Duc vdle at moasys.com
Wed Dec 7 08:42:14 UTC 2022


Dear Slurm community,


I am encountering a situation where I need to allocate a single job
across nodes with different numbers of CPU cores. For instance:

node01: Xeon 6226, 32 cores
node02: EPYC 7543, 64 cores


$ salloc --partition=all --nodes=2 --nodelist=node01,node02 \
         --ntasks-per-node=32 --comment=etc
If --ntasks-per-node is larger than 32, the job cannot be allocated,
since node01 has only 32 cores.
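
One option that can express a different task count per node is a
heterogeneous job, where each component carries its own resource
request. A sketch only, assuming a Slurm release with heterogeneous
job support; we have not yet confirmed this works with the
container's launch scripts:

$ salloc --partition=all --nodelist=node01 --ntasks-per-node=32 : \
         --partition=all --nodelist=node02 --ntasks-per-node=64

The two components then form het groups 0 and 1, and a step spanning
both nodes can be launched with srun --het-group=0,1.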


In the context of NVIDIA's HPL container, we need to pin MPI
processes according to NUMA affinity for best performance.

For HGX-1, the eight A100s have affinity with the 1st, 3rd, 5th, and
7th NUMA domains, two GPUs per domain.

With --ntasks-per-node=32, only the first half of the EPYC's eight
NUMA domains is covered by ranks, so we had to assign the 4th-7th
A100s to the 0th and 2nd NUMA domains, leading to some performance
degradation.
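
For reference, the per-rank binding looks roughly like the wrapper
below. This is only a sketch: the local-rank variables are the
standard OpenMPI/Slurm ones, but the GPU-to-NUMA map and the
one-rank-per-GPU layout are assumptions for illustration.

#!/bin/bash
# bind.sh - bind each local MPI rank to its GPU's NUMA domain
# (assumes 8 ranks per node, one per A100)
LRANK=${OMPI_COMM_WORLD_LOCAL_RANK:-$SLURM_LOCALID}
GPU=$LRANK
NUMA_MAP=(1 1 3 3 5 5 7 7)   # GPU i -> NUMA domain, two GPUs per domain
NODE=${NUMA_MAP[$GPU]}
export CUDA_VISIBLE_DEVICES=$GPU
exec numactl --cpunodebind=$NODE --membind=$NODE "$@"

It would be invoked as, e.g., mpirun ./bind.sh xhpl (binary name
hypothetical).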


I am looking for a way to request more tasks than the number of
physically available cores, e.g.:

$ salloc --partition=all --nodes=2 --nodelist=node01,node02 \
         --ntasks-per-node=64 --comment=etc
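
If oversubscribing node01 is acceptable, another route might be
srun's --overcommit inside an existing allocation, which permits
launching more tasks than allocated CPUs. Again a sketch; we have
not measured its impact on HPL:

$ salloc --partition=all --nodes=2 --nodelist=node01,node02 \
         --ntasks-per-node=32 --comment=etc
$ srun --ntasks-per-node=64 --overcommit ./run_hpl.sh

(run_hpl.sh stands in for our actual container launch script.)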


Your suggestions are much appreciated.


Regards,

Viet-Duc