[slurm-users] Restart Job after sudden reboot of the node

Christopher Samuel chris at csamuel.org
Sat Jul 25 01:30:25 UTC 2020

On 7/24/20 12:28 pm, Saikat Roy wrote:

> If SLURM restarts automatically, is there any way to stop it?

If you would rather Slurm not start scheduling jobs when it is restarted 
then you can set your partitions to have `State=DOWN` in slurm.conf.

That way, if the node running slurmctld reboots, Slurm won't start 
scheduling jobs until you tell it to.
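
As a rough illustration, the partition line might look something like 
this. The partition name "batch" and the node range are placeholders I 
made up for the example, so substitute your own:

```
# slurm.conf (fragment); "batch" and node[01-04] are placeholder names
PartitionName=batch Nodes=node[01-04] Default=YES State=DOWN
```

When you are ready for jobs to be scheduled again, something like 
`scontrol update PartitionName=batch State=UP` should bring the 
partition back up.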

For compute nodes I believe Slurm should detect any node that reboots 
and mark it "DOWN" with the reason set to "Node unexpectedly rebooted".
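
If that happens, a couple of small shell helpers along these lines may 
be handy for checking the reason and returning the node to service. The 
function names are just ones I picked for this sketch; the sinfo and 
scontrol invocations underneath are the standard commands. Whether a 
rebooted node stays down or comes back by itself also depends on your 
ReturnToService setting, so treat this as a starting point:

```shell
# Hypothetical helper names; only sinfo and scontrol are real Slurm commands.

# List down/drained nodes together with the reason Slurm recorded.
show_down_reasons() {
    sinfo -R
}

# Return a node to service once you are satisfied it is healthy.
# Usage: resume_node <nodename>
resume_node() {
    if [ -z "$1" ]; then
        echo "usage: resume_node <nodename>" >&2
        return 1
    fi
    scontrol update NodeName="$1" State=RESUME
}
```

For example, `show_down_reasons` followed by `resume_node node01`, 
where node01 is a placeholder node name.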

All the best,
   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
