[slurm-users] Upgrading a slurm on a cluster, 17.02 --> 18.08

Tue Sep 25 06:00:27 MDT 2018

On Tuesday, 25 September 2018 9:41:10 PM AEST Baker D. J.  wrote:

> I guess that the only solution is to upgrade all the slurm at once. That
> means that the slurmctld will be killed (unless it has been stopped first).

We don't use RPMs from Slurm [1], but the rpm command does have a --noscripts 
option to (allegedly, I've never used it) suppress the execution of pre/post 
install scripts.
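
For illustration, a minimal sketch of what that might look like. It is
untested against the Slurm RPMs; the package file names are placeholders
and the DRY_RUN wrapper is only a convenience for previewing the commands.
"rpm -qp --scripts" lists the scriptlets a package carries, so you can see
what --noscripts would be skipping.

```shell
#!/bin/sh
# Sketch only: print (DRY_RUN=1, the default) or run an RPM upgrade that
# skips the package scriptlets. Package file names below are placeholders.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_noscripts() {
    # Show which scriptlets each package carries, so you know what is skipped.
    for pkg in "$@"; do
        run rpm -qp --scripts "$pkg"
    done
    # Upgrade without running the pre/post (un)install scriptlets.
    run rpm -Uvh --noscripts "$@"
}

upgrade_noscripts slurm-18.08.0-1.el7.x86_64.rpm \
                  slurm-slurmctld-18.08.0-1.el7.x86_64.rpm
```

Anything the skipped scriptlets would normally do (for example restarting
services) then has to be done by hand.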

A big warning: do not use systemctl to start the new slurmdbd for the first 
time when upgrading!

Stop the old one first and take a database dump, then run the new slurmdbd 
process in the foreground with the "-Dvvv" options (inside screen, just in 
case). That way you can watch its progress, and systemd won't decide it's 
been taking too long to start and try to kill it part way through the 
database upgrades.

Once that has completed successfully you can ^C it and start it up via 
systemctl once more.
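
The order of operations above could be sketched roughly as follows. This
is an illustration under assumptions that are not from this post: the
systemd unit is called "slurmdbd", accounting is stored in MySQL/MariaDB
in a database named slurm_acct_db, mysqldump can log in without prompting,
and the new slurmdbd is already installed and first in PATH. By default the
script only prints the commands.

```shell
#!/bin/sh
# Sketch of the slurmdbd upgrade order. With DRY_RUN=1 (the default) it only
# prints the commands; set DRY_RUN=0 to execute them as root.
#
# Assumptions, not from the original post:
#   - the systemd unit is called "slurmdbd"
#   - the accounting database is MySQL/MariaDB and named slurm_acct_db
#   - mysqldump can log in without prompting (e.g. via ~/.my.cnf)
#   - the new slurmdbd binary is already installed and first in PATH
DRY_RUN="${DRY_RUN:-1}"
DB_NAME="${DB_NAME:-slurm_acct_db}"
DUMP_FILE="${DUMP_FILE:-/root/${DB_NAME}-pre-upgrade.sql}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop the old daemon, dump the database, then convert in the foreground.
dbd_convert() {
    run systemctl stop slurmdbd || return 1
    run mysqldump --single-transaction --result-file="$DUMP_FILE" "$DB_NAME" || return 1
    # Stays in the foreground until interrupted: run this inside screen and
    # press ^C once the log shows the database upgrade has finished.
    run slurmdbd -Dvvv
}

# Hand the daemon back to systemd; run this separately, after the ^C.
dbd_resume() {
    run systemctl start slurmdbd
}

case "${1:-plan}" in
    convert) dbd_convert ;;
    resume)  dbd_resume ;;
    *)       DRY_RUN=1; dbd_convert; dbd_resume ;;   # default: print the plan
esac
```

The two halves are kept separate on purpose: the ^C that ends the
foreground slurmdbd is normally delivered to the calling script as well, so
the final systemctl start is a separate "resume" invocation.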

Hope that's useful!

All the best,
Chris

[1] - I've always installed into ${shared_local_area}/slurm/${version} and 
kept a symlink called "latest" that points at the currently blessed version 
of Slurm.  I stop slurmdbd and upgrade it as above, then upgrade slurmctld 
(with partitions marked down, just in case).  Once those are done I 
restart the slurmds around the cluster.
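
A small sketch of how such a "latest" symlink might be switched. The
helper name switch_latest, the temporary directory standing in for
${shared_local_area}, and the version strings are all made up for the
demo; only the layout itself comes from the footnote.

```shell
#!/bin/sh
# Sketch of a versioned install tree with a "latest" symlink.
# Directory names and version strings below are hypothetical.

# switch_latest PREFIX VERSION
# Point PREFIX/slurm/latest at PREFIX/slurm/VERSION.
switch_latest() {
    prefix="$1"
    version="$2"
    if [ ! -d "$prefix/slurm/$version" ]; then
        echo "no such install: $prefix/slurm/$version" >&2
        return 1
    fi
    # -n replaces the existing link itself instead of creating a new link
    # inside the directory it currently points at (GNU and BSD ln).
    ln -sfn "$version" "$prefix/slurm/latest"
}

# Demo in a throwaway directory standing in for ${shared_local_area}.
demo="$(mktemp -d)"
mkdir -p "$demo/slurm/17.02.11" "$demo/slurm/18.08.0"
switch_latest "$demo" 17.02.11
switch_latest "$demo" 18.08.0
readlink "$demo/slurm/latest"    # prints 18.08.0
rm -rf "$demo"
```

The link target is relative, so the tree keeps working if the shared area
is mounted at a different path on different nodes.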

-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC