25 Feb 2024, 8:12 p.m.
Hi Fellow Slurm Users,

This question is not Slurm-specific, but it might develop into that. My question relates to understanding how *typical* HPC clusters are designed in terms of networking.

To start: is it typical to have both a high-speed Ethernet network *and* an InfiniBand network (meaning separate switches and NICs)? I know you can easily set up IP over IB (IPoIB), but is IB usually reserved entirely for MPI traffic?

I'm tempted to spec all new HPC clusters with only a single high-speed (200 Gbps) IB network and use IPoIB for all Slurm communication with the compute nodes. I plan on using BeeGFS for the file system, with RDMA.

Just looking for some feedback, please. Is this OK? Is there a better way? If so, please share why it's better.

Thanks,
Daniel Healy
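For anyone picturing the single-fabric setup being described, here is a minimal sketch of bringing up IPoIB on a compute node and sanity-checking that RDMA is still available for BeeGFS underneath it. The interface name (ib0), addresses, and the RHEL-style sysfs path are assumptions for illustration, not a tested recipe:

```shell
# Hypothetical sketch: IPoIB on ib0 so Slurm traffic can ride the IB fabric.
# Addresses and interface names are examples only.

# Assign an IP to the IPoIB interface and bring it up
ip addr add 10.10.0.11/24 dev ib0
ip link set ib0 up

# Switch IPoIB to connected mode, which allows a large MTU (up to 65520)
# and generally improves IPoIB throughput for bulk TCP traffic
echo connected > /sys/class/net/ib0/mode
ip link set ib0 mtu 65520

# Verify the IB link and that RDMA devices are visible to verbs consumers
ibstat          # port state should be "Active"
ibv_devinfo     # RDMA-capable devices (used by BeeGFS/MPI, not IPoIB)
```

If the compute nodes' hostnames resolve to their IPoIB addresses (or you point Slurm's per-node `NodeAddr` at them in slurm.conf), slurmctld/slurmd traffic would flow over the IB fabric, while BeeGFS and MPI bypass IPoIB entirely via native RDMA verbs.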