[slurm-users] Upgrading slurm - can I do it while jobs running?

Will Dennis wdennis at nec-labs.com
Wed May 26 18:23:17 UTC 2021

Hi all,

I'm about to embark on my first Slurm upgrade. I'm building from source now, into a versioned path /opt/slurm/<vernum>/, which is then symlinked to /opt/slurm/current/ for the “in-use” version. This is a new cluster running 20.11.5 (which we now know has a CVE that was fixed in 20.11.7), but I have researchers running jobs on it currently. As I’m still building out the cluster, I found today that all Slurm source tarballs before 20.11.7 were withdrawn by SchedMD. So I need to upgrade at least the -ctld and -dbd nodes before I can roll out any new nodes on 20.11.7.
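
For anyone picturing the layout, here is a rough sketch of the "versioned directory plus current symlink" switch described above. It is only an illustration under assumptions: the helper name point_current_at is hypothetical, the demo runs in a throwaway temp directory rather than /opt/slurm, and it says nothing about stopping or restarting the Slurm daemons themselves.

```python
import os
import tempfile


def point_current_at(base_dir: str, version: str, link_name: str = "current") -> str:
    """Repoint base_dir/link_name at base_dir/version and return the link target.

    Hypothetical helper: builds a temporary symlink and renames it over the
    old one, so the "current" path never disappears mid-switch.
    """
    target = os.path.join(base_dir, version)
    if not os.path.isdir(target):
        raise FileNotFoundError(f"no such install directory: {target}")

    link_path = os.path.join(base_dir, link_name)
    tmp_link = f"{link_path}.tmp.{os.getpid()}"

    # Clear out a stale temporary link left behind by an interrupted run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)

    # Use a relative target so the link stays valid if base_dir is relocated.
    os.symlink(version, tmp_link)
    # rename(2) replaces the old symlink itself; it does not follow it.
    os.replace(tmp_link, link_path)
    return os.readlink(link_path)


if __name__ == "__main__":
    # Demo in a scratch directory standing in for /opt/slurm (assumption).
    with tempfile.TemporaryDirectory() as base:
        for ver in ("20.11.5", "20.11.7"):
            os.mkdir(os.path.join(base, ver))
        point_current_at(base, "20.11.5")
        print(point_current_at(base, "20.11.7"))
```

Keeping the old versioned directory in place means falling back is just a matter of pointing the link at the previous version again.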

As I have at least one researcher who is running some long multi-day jobs, can I take down the -dbd and -ctld nodes, upgrade them, and then bring them back online running the new (latest) release, without disrupting the jobs on the running worker nodes?

