[slurm-users] Upgrading a slurm on a cluster, 17.02 --> 18.08
Chris Samuel
chris at csamuel.org
Tue Sep 25 06:00:27 MDT 2018
On Tuesday, 25 September 2018 9:41:10 PM AEST Baker D. J. wrote:
> I guess that the only solution is to upgrade all the slurm at once. That
> means that the slurmctld will be killed (unless it has been stopped first).
We don't use RPMs from Slurm [1], but the rpm command does have a --noscripts
option to (allegedly, I've never used it) suppress the execution of pre/post
install scripts.
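As a rough sketch of what that could look like (untested here, just as the option itself is untested above): "rpm -qp --scripts" lists the scriptlets a package file carries and "--noscripts" skips them during the upgrade. The function names and the DRY_RUN switch are made up for illustration, and the package file names are whatever you pass in. Anything the skipped scriptlets would normally do then has to be handled by hand.

```shell
#!/bin/sh
# With DRY_RUN=1 (the default) commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Show the pre/post install scriptlets contained in the given package files.
show_package_scripts() {
    run rpm -qp --scripts "$@"
}

# Upgrade the given package files without executing their scriptlets.
upgrade_without_scripts() {
    if [ "$#" -eq 0 ]; then
        echo "usage: upgrade_without_scripts PACKAGE.rpm..." >&2
        return 2
    fi
    run rpm -Uvh --noscripts "$@"
}
```

Review the output of show_package_scripts first, so you know what is being skipped, before calling upgrade_without_scripts with DRY_RUN=0.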
A big warning: do not use systemctl to start the new slurmdbd for the first
time when upgrading!
Stop the older one first (and then take a database dump), then run the new
slurmdbd process with the "-Dvvv" options (inside screen, just in case) so
that you can watch its progress and systemd won't decide it's been taking too
long to start and try to kill it part way through the database upgrades.
Once that's completed successfully you can ^C it and start it up via
systemctl once more.
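The steps above can be sketched as a small script. This is only an outline under assumptions that are not in the post: the accounting database is MySQL/MariaDB, it is called slurm_acct_db, credentials come from something like ~/.my.cnf, and the dump file name is arbitrary. The function names and the DRY_RUN switch are hypothetical too.

```shell
#!/bin/sh
# Sketch of the slurmdbd upgrade steps. With DRY_RUN=1 (the default) each
# command is only printed; set DRY_RUN=0 to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# Assumptions, not from the original post: adjust for your site.
DB_NAME="${DB_NAME:-slurm_acct_db}"
DUMP_FILE="${DUMP_FILE:-slurm_acct_db-pre-upgrade.sql}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: stop the old daemon, then dump the accounting database.
stop_and_dump() {
    run systemctl stop slurmdbd &&
    run mysqldump --single-transaction --result-file="$DUMP_FILE" "$DB_NAME"
}

# Step 2: run the NEW slurmdbd in the foreground (-D) with extra verbosity
# (-vvv) so the database conversion is visible and systemd is not involved.
# Press ^C once the conversion has completed successfully.
convert_in_foreground() {
    run slurmdbd -Dvvv
}

# Step 3: hand the daemon back to systemd.
start_service() {
    run systemctl start slurmdbd
}
```

Source this inside a screen session and call the three functions one at a time, in order, with DRY_RUN=0. They are kept separate on purpose so that the ^C in step 2 cannot take the rest of the sequence down with it.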
Hope that's useful!
All the best,
Chris
[1] - I've always installed into ${shared_local_area}/slurm/${version} and had
a symlink called "latest" that points at the currently blessed version of
Slurm. Then I stop slurmdbd, upgrade that as above, then I can upgrade
slurmctld (with partitions marked down, just in case). Once those are done I
can restart the slurmds around the cluster.
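A minimal sketch of switching the "latest" symlink in that kind of layout. The function name and its arguments are hypothetical, and it assumes an ln that supports -n (GNU and BSD both do).

```shell
#!/bin/sh
# Point <root>/latest at <root>/<version>.
# root    : the directory holding one subdirectory per installed version,
#           e.g. ${shared_local_area}/slurm
# version : the subdirectory to bless, e.g. 18.08
switch_slurm_version() {
    root="$1"
    version="$2"
    if [ ! -d "$root/$version" ]; then
        echo "no such install: $root/$version" >&2
        return 1
    fi
    # -n replaces the existing symlink itself instead of creating a new
    # link inside the directory it currently points at.
    ln -sfn "$version" "$root/latest"
}
```

For example, switch_slurm_version "${shared_local_area}/slurm" 18.08 would be run once the new version is installed, with the daemons then restarted in the order given above: slurmdbd, slurmctld, then the slurmds.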
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC