[slurm-users] Rolling reboot with at most N machines down simultaneously?
Corentin Mercier
corentin.a.mercier at inria.fr
Fri Aug 5 09:27:00 UTC 2022
Hello,
I think you could use Slurm's power saving mechanism to shut down all your nodes simultaneously.
Then running srun -N<nb_nodes> -C <your_node_group> true (or any other small job) will wake up <nb_nodes> nodes at once.
You can even run srun while your nodes are still powering down; Slurm will boot them again as soon as they have finished powering down.
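
As a rough sketch of the two steps above (not a tested procedure), something like the following could be used. It assumes power saving is already configured in slurm.conf (SuspendProgram and ResumeProgram set), and that requesting State=POWER_DOWN through scontrol is an acceptable way to trigger the shutdown. The node list, feature name, node count and the DRY_RUN switch are hypothetical placeholders of mine, not part of your setup. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: NODES, FEATURE, NB_NODES and DRY_RUN are hypothetical
# placeholders; adapt them to the real cluster before use.
NODES="${NODES:-node[01-16]}"          # node list to power down (assumed name)
FEATURE="${FEATURE:-your_node_group}"  # feature/constraint of the node group
NB_NODES="${NB_NODES:-4}"              # how many nodes to wake up at once
DRY_RUN="${DRY_RUN:-1}"                # 1 = only print the commands

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Ask Slurm to power the nodes down through its power saving mechanism.
power_down() {
    run scontrol update NodeName="$NODES" State=POWER_DOWN
}

# Submit a trivial job; Slurm has to resume NB_NODES nodes to run it.
wake_up() {
    run srun -N"$NB_NODES" -C "$FEATURE" true
}

power_down
wake_up
```

Running it with DRY_RUN=0 would issue the real commands, so it is worth checking the printed output first.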
I hope this helps!
Regards,
C.Mercier