[slurm-users] [External] slurmd -C vs lscpu - which do I use to populate slurm.conf?
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Thu Apr 29 06:02:46 UTC 2021
On 4/29/21 1:06 AM, Michael Robbert wrote:
> I think that you want to use the output of slurmd -C, but if that isn’t
> telling you the truth then you may not have built slurm with the correct
> libraries. I believe that you need to build with hwloc in order to get the
> most accurate details of the CPU topology. Make sure you have hwloc-devel
> installed and try to rebuild Slurm.
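If you go the slurmd -C route suggested above, a quick sanity check on its output can be sketched in Python as below. This is only a sketch: the sample text is a made-up illustration of the usual NodeName=... key=value line, not output from a real node, and the helper name parse_slurmd_c is hypothetical.

```python
def parse_slurmd_c(output):
    """Return the key=value pairs from the NodeName line of `slurmd -C` output."""
    for line in output.splitlines():
        if line.startswith("NodeName="):
            # Each whitespace-separated token is expected to look like Key=Value.
            return dict(token.split("=", 1) for token in line.split())
    raise ValueError("no NodeName= line found in slurmd -C output")


# Hypothetical sample output (assumed values, for illustration only).
SAMPLE_SLURMD_C = """\
NodeName=node001 CPUs=32 Boards=1 SocketsPerBoard=2 CoresPerSocket=8 ThreadsPerCore=2 RealMemory=191000
UpTime=12-03:04:05
"""

fields = parse_slurmd_c(SAMPLE_SLURMD_C)

# The logical CPU count should equal boards * sockets * cores * threads.
logical_cpus = (
    int(fields["Boards"])
    * int(fields["SocketsPerBoard"])
    * int(fields["CoresPerSocket"])
    * int(fields["ThreadsPerCore"])
)
consistent = logical_cpus == int(fields["CPUs"])
print("topology consistent with CPUs:", consistent)
```

If the product does not match CPUs, or the socket and core counts disagree with lscpu, that is a hint to check how Slurm was built before copying the line into slurm.conf.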
Slurm has many build prerequisites; what I believe is the full list is
here:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#install-prerequisites
If you have recent Xeon or EPYC CPUs, you should also consider the number
of NUMA domains per socket. Use numactl -H to see what you've got.
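As a rough illustration of reading the numactl -H output, the Python sketch below pulls the NUMA node count from the "available: N nodes" line and divides it by the socket count. The sample text and the socket count of 2 are assumptions made up for the example, and numa_node_count is a hypothetical helper name.

```python
import re


def numa_node_count(numactl_output):
    """Return the node count from the 'available: N nodes' line of `numactl -H`."""
    match = re.search(r"^available:\s+(\d+)\s+nodes", numactl_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'available: N nodes' line found")
    return int(match.group(1))


# Hypothetical sample output (assumed values, for illustration only).
SAMPLE_NUMACTL_H = """\
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3
node 0 size: 47000 MB
node 1 cpus: 4 5 6 7
node 1 size: 47000 MB
node 2 cpus: 8 9 10 11
node 2 size: 47000 MB
node 3 cpus: 12 13 14 15
node 3 size: 47000 MB
"""

sockets = 2  # assumption: take this from lscpu or slurmd -C on the real node
numa_nodes = numa_node_count(SAMPLE_NUMACTL_H)
numa_per_socket = numa_nodes // sockets
print("NUMA domains per socket:", numa_per_socket)
```

On a real node the text could come from subprocess.run(["numactl", "-H"], capture_output=True, text=True).stdout instead of the sample string.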
Regarding slurmd -C and multiple NUMA domains per socket, there's a small
bug being sorted out in https://bugs.schedmd.com/show_bug.cgi?id=11434
It may be beneficial for HPC applications to enable Sub-NUMA Clustering
(SNC) in the BIOS; see
https://www.dell.com/support/kbdoc/da-dk/000176921/bios-characterization-for-hpc-with-intel-cascade-lake-processors
/Ole