Hi Fellow Slurm Users,
This question is not Slurm-specific, but it might develop into that.
My question relates to understanding how *typical* HPCs are designed in terms of networking. To start: is it typical to have separate high-speed Ethernet *and* InfiniBand networks (meaning separate switches and NICs)? I know you can easily set up IP over IB, but is IB usually reserved entirely for MPI traffic? I'm tempted to spec all new HPCs with only a single high-speed (200 Gbps) IB network, and use IPoIB for all Slurm comms with the compute nodes. I plan on using BeeGFS for the file system, with RDMA.
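For concreteness, here's a rough sketch of the kind of configuration I'm imagining. All interface names, addresses, and node names below are placeholders, not recommendations:

```
# IPoIB interface on each node (e.g., ifcfg-ib0 on RHEL-family systems)
DEVICE=ib0
TYPE=InfiniBand
BOOTPROTO=static
IPADDR=10.10.0.101
PREFIX=16
ONBOOT=yes

# /etc/beegfs/beegfs-client.conf -- BeeGFS data traffic over RDMA,
# falling back to the IPoIB addresses for TCP connections
connUseRDMA = true

# slurm.conf -- point slurmctld/slurmd traffic at the IPoIB addresses
NodeName=node101 NodeAddr=10.10.0.101 ...
```

So Slurm control traffic and any TCP fallback would ride IPoIB on the same fabric, while BeeGFS and MPI use native RDMA/verbs.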
Just looking for some feedback, please. Is this OK? Is there a better way? If so, please share why it's better.
Thanks,
Daniel Healy