[slurm-users] Slurm does not start after (stupid) upgrade from 16.05.9 to 20.11.7

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Wed Aug 25 09:13:41 UTC 2021

On 8/25/21 10:48 AM, Julien Tailleur wrote:
> We have been running a computing cluster using slurm since 2016, that I 
> installed back then, with some help from others. I was pretty late on 
> upgrades and decided to upgrade the cluster up to debian Bullseye, which 
> runs slurm 20.11.7, starting from stretch, that runs slurm 16.05.9.

SchedMD documents that a single upgrade may span at most 2 major releases, see 
https://slurm.schedmd.com/quickstart_admin.html#upgrade.  So you would 
have to go through 16.05 -> 17.02 -> 18.08 -> 20.02 -> 20.11 (soon 21.08 
will be out).  I don't know whether Debian packages can still be found for 
these old versions.
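
To make the 2-release rule concrete, here is a minimal sketch that checks a 
proposed upgrade path hop by hop.  The release list is hard-coded from memory 
of the Slurm major releases between 16.05 and 20.11, and the function name is 
made up for illustration; please verify the list against SchedMD's download 
archive before relying on it.

```python
# Slurm major releases from 16.05 to 20.11, oldest first.
# Assumption: hard-coded list, verify against SchedMD's release archive.
RELEASES = ["16.05", "17.02", "17.11", "18.08", "19.05", "20.02", "20.11"]

# SchedMD's documented limit: at most 2 major releases per upgrade step.
MAX_JUMP = 2


def check_path(path, releases=RELEASES, max_jump=MAX_JUMP):
    """Return the hops in `path` that break the rule.

    Each offending hop is reported as (source, destination, jump), where
    jump is the number of major releases crossed.  An empty list means
    every hop moves forward by 1..max_jump releases.
    """
    bad = []
    for src, dst in zip(path, path[1:]):
        jump = releases.index(dst) - releases.index(src)
        if jump < 1 or jump > max_jump:
            bad.append((src, dst, jump))
    return bad


if __name__ == "__main__":
    # The 4-step path suggested in this mail.
    print(check_path(["16.05", "17.02", "18.08", "20.02", "20.11"]))
    # The direct jump that Debian stretch -> bullseye amounts to.
    print(check_path(["16.05", "20.11"]))
```

The first call reports no offending hops, while the direct 16.05 -> 20.11 
jump is flagged as crossing 6 releases.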

I have collected some Slurm upgrading information in
It's written for CentOS, but the Slurm parts would be the same.

> While the update of the system in itself went smoothly, slurm is broken. 
> Of course, that's the stage at which I thought "Oh, I should have checked 
> if the upgrade is supposed to be harmless"... Now that the self-bashing 
> is rightfully done, I would be very happy with some help! I hesitate 
> between two strategies: removing slurm completely and a completely new 
> installation, or trying to save what can be saved... I am tempted by the 
> former since I remember suffering a bit to get the installation right in 
> the first place...

A usable database dump from the old 16.05 installation is vital!  You could 
start again with Slurm 16.05 and upgrade in 4 steps as indicated above.
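
For reference, a dump of the accounting database could be taken roughly as 
sketched here.  This assumes a MySQL/MariaDB backend with mysqldump on the 
PATH, credentials supplied via ~/.my.cnf, and the database name 
slurm_acct_db (the Slurm default, your StorageLoc in slurmdbd.conf may 
differ).  The helper names are hypothetical.

```python
import subprocess
from datetime import date


def dump_command(database="slurm_acct_db", outfile=None):
    """Build the mysqldump command line without running it.

    Assumption: `database` matches StorageLoc in slurmdbd.conf.
    """
    if outfile is None:
        # Date-stamped file name so older dumps are not overwritten.
        outfile = f"{database}-{date.today().isoformat()}.sql"
    return [
        "mysqldump",
        "--single-transaction",        # consistent snapshot for InnoDB tables
        f"--result-file={outfile}",    # write the dump to this file
        "--databases", database,       # include the CREATE DATABASE statement
    ]


def run_dump(**kwargs):
    """Run the dump and raise CalledProcessError if mysqldump fails."""
    subprocess.run(dump_command(**kwargs), check=True)


if __name__ == "__main__":
    # Print the command first; call run_dump() once it looks right.
    print(" ".join(dump_command()))
```

Stopping slurmdbd before taking the dump, and keeping a copy of the file 
off the cluster, would be sensible precautions.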

Beware of potential database issues:

If the 4-step upgrade doesn't work, starting from scratch seems to be the 
only option :-(  My Slurm Wiki page may perhaps be of a little help: 

