[slurm-users] Upgrading slurm - can I do it while jobs running?

Antony Cleave antony.cleave at gmail.com
Wed May 26 18:43:38 UTC 2021

Short answer: yes.

It's not risk-free, but you should be fine as long as you increase all the
timeouts to four times your worst-case estimate, make sure you understand
the upgrades section of this link, and keep it open for reference.
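
As a rough sketch of the timeout arithmetic, assuming the timeouts in question are SlurmctldTimeout and SlurmdTimeout in slurm.conf and that your worst-case downtime estimate is 30 minutes (both are assumptions for illustration; the helper names are made up):

```python
# Sketch: derive Slurm timeout values from a worst-case downtime estimate.
# The x4 safety factor is the rule of thumb from this thread; the 30-minute
# estimate used in the example is an assumption, not a recommendation.

def upgrade_timeouts(worst_case_minutes: int, factor: int = 4) -> dict:
    """Return slurm.conf timeout settings (in seconds) for the upgrade window."""
    if worst_case_minutes <= 0:
        raise ValueError("worst_case_minutes must be positive")
    seconds = worst_case_minutes * 60 * factor
    # Assumed to be the relevant parameters; check your own slurm.conf.
    return {"SlurmctldTimeout": seconds, "SlurmdTimeout": seconds}


def as_conf_lines(settings: dict) -> str:
    """Format the settings as key=value lines suitable for slurm.conf."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    print(as_conf_lines(upgrade_timeouts(30)))
```

The resulting lines would go into slurm.conf before the upgrade, followed by "scontrol reconfigure" so the daemons pick them up. Check the slurm.conf man page for the allowed range of each value, and put the original values back once everything is on the new version.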


On Wed, 26 May 2021, 19:25 Will Dennis, <wdennis at nec-labs.com> wrote:

> Hi all,
> About to embark on my first Slurm upgrade (building from source now, into
> a versioned path /opt/slurm/<vernum>/ which is then symlinked to
> /opt/slurm/current/ for the “in-use” one…) This is a new cluster, running
> 20.11.5 (which we now know has a CVE that was fixed in 20.11.7) but I have
> researchers running jobs on it currently. As I’m still building out the
> cluster, I found today that all Slurm source tarballs before 20.11.7 were
> withdrawn by SchedMD. So, need to upgrade at least the -ctld and -dbd nodes
> before I can roll any new nodes out on 20.11.7…
> As I have at least one researcher that is running some long multi-day
> jobs, can I down the -dbd and -ctld nodes and upgrade them, then put them
> back online running the new (latest) release, without munging the jobs on
> the running worker nodes?
> Thanks!
> Will