[slurm-users] Rolling reboot with at most N machines down simultaneously?
corentin.a.mercier at inria.fr
Fri Aug 5 09:27:00 UTC 2022
I think you could use SLURM's power saving mechanism to shut down all your nodes simultaneously.
Then running srun -N<nb_nodes> -C <your_node_group> true (or any other small job) will wake up <nb_nodes> nodes simultaneously.
You can even run srun while your nodes are still powering down; SLURM will boot them again as soon as they are powered down.
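
To make the two steps concrete, here is a rough sketch, not a tested procedure. It assumes power saving is already configured in slurm.conf (SuspendProgram, ResumeProgram and related settings), and that the power-down is requested with scontrol, which is my assumption rather than something stated above. The values of NB_NODES, NODE_GROUP and NODE_LIST are hypothetical placeholders, and the run helper is only there so the commands can be previewed before anything is actually powered off.

```shell
#!/bin/sh
# Hypothetical placeholders: adjust to your own cluster.
NB_NODES=4               # how many nodes to wake up at once
NODE_GROUP=rack1         # node feature passed to srun -C (assumed name)
NODE_LIST="node[01-04]"  # nodes to power down (assumed names)

# Print the command by default; set DRY_RUN=0 to really execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1 (assumption): ask SLURM to power the nodes down via power saving.
run scontrol update NodeName="$NODE_LIST" State=POWER_DOWN

# Step 2: a trivial job on NB_NODES nodes of the group makes SLURM
# power those nodes back up so the job can start.
run srun -N"$NB_NODES" -C "$NODE_GROUP" true
```

Run it as is to see the two commands, then with DRY_RUN=0 once the node names and feature match your configuration.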
I hope this helps!