[slurm-users] restarting slurmctld restarts jobs???

Diego Zuccato diego.zuccato at unibo.it
Mon Sep 20 09:08:32 UTC 2021


Hello all.

After the summer break, I noticed that rebooting one of the two slurmctld
nodes kills and requeues all running jobs. Before the break, a reboot did
not affect running jobs, and nobody changed the config in the meantime... Duh?

Today I just restarted slurmctld and slurmd: same kill and requeue.

I'm currently in the process of adding some nodes, but I have already done
that several times with no issues (actually, the second slurmctld node was
installed to catch the race of a job terminating while the main
slurmctld was shut down).

Anything I should double-check?

Thanks.

-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786


