Hi All,
I am working on a single Linux machine running Ubuntu 25.10 and want to performance-test the Slurm backfill scheduling algorithm. My goal is to simulate a 1,024-node cluster on that one machine.
While researching existing solutions, I found that many current Slurm simulator implementations (including various plugin-based designs) rely heavily on Docker and Docker Compose. I would like to avoid containers and run the simulation natively on the host OS instead.
I understand that older versions of the Slurm source code included a native sim directory that allowed for simulations without external dependencies like Docker. Given this, I have the following technical questions:
Native Compatibility: Is it still feasible to compile and run a Slurm simulator natively on Ubuntu 25.10?
Version Recommendations: Would you recommend using a specific legacy version of Slurm that still contains the original simulation files, or is there a way to port that functionality to a more recent release?
Configuration: Are there specific build flags or configuration steps needed to simulate 1,024 nodes on a single machine, without incurring the overhead of containerized networking?
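For context on the Configuration question, the closest native approach I have come across is building Slurm with multiple-slurmd support (the --enable-multiple-slurmd configure option) and defining every emulated node on localhost with its own port. I have not verified this at 1,024 nodes or on Ubuntu 25.10, so the Python sketch below is only my assumption of what the node definitions would look like. The node prefix, partition name, port range, and CPU count are placeholders I made up:

```python
def build_node_lines(count=1024, base_port=17001, prefix="sim", partition="simq"):
    """Return slurm.conf lines for `count` emulated nodes on localhost.

    Assumes a Slurm build with multiple-slurmd support; every name and
    number here is a placeholder, not a tested configuration.
    """
    last_port = base_port + count - 1
    if last_port > 65535:
        raise ValueError("port range exceeds 65535")
    node_range = f"{prefix}[1-{count}]"
    # One slurmd per emulated node, each listening on its own port.
    node_line = (
        f"NodeName={node_range} NodeHostname=localhost NodeAddr=127.0.0.1 "
        f"Port=[{base_port}-{last_port}] CPUs=1"
    )
    partition_line = f"PartitionName={partition} Nodes={node_range} Default=YES State=UP"
    return [node_line, partition_line]


if __name__ == "__main__":
    # Print the fragment so it can be pasted into a test slurm.conf.
    for line in build_node_lines():
        print(line)
```

As far as I understand, each emulated node would then be started as its own slurmd with the -N <nodename> option, but please correct me if this is not a sensible route for backfill testing.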
I would appreciate any guidance or documentation you could provide on achieving a high-node-count simulation in a "bare-metal" Linux environment.
Best regards,