[slurm-users] Migrate the slurmdbd service to another server
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Mon Mar 4 13:15:55 UTC 2019
On 3/4/19 1:27 PM, Paul Edmon wrote:
> That should work.
It did work as expected :-)
> The upgrade though will have to wait until the dbd is actually on a different server.
Yes, that's the whole point of first migrating slurmdbd to a different
server! Upgrading the Slurm RPMs on the slurmdbd server from 17.11 to
18.08 consequently doesn't impact the running slurmctld. I did so this
morning without any troubles at all.
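
A minimal sketch of the version rule this relies on, assuming the usual Slurm constraint that slurmdbd must run the same or a newer release than the slurmctld talking to it. The function names are hypothetical helpers, not part of Slurm:

```python
def parse_release(version):
    """Return (major, minor) from a Slurm version string such as '18.08.5'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def dbd_can_serve(dbd_version, ctld_version):
    """True if slurmdbd is at the same or a newer release than slurmctld.

    Assumption: only the major release (e.g. 17.11 vs 18.08) matters here.
    """
    return parse_release(dbd_version) >= parse_release(ctld_version)


# slurmdbd upgraded first, slurmctld still on the old release
print(dbd_can_serve("18.08", "17.11"))
# the reverse order, which the recommended upgrade sequence avoids
print(dbd_can_serve("17.11", "18.08"))
```
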
> We run the ctld and dbd on the same machine here for the sake of performance. Before the RPM reorg we used to upgrade only the dbd first and then the ctld, but with the reorg I'm taking a downtime for the dbd upgrade. That's not too bad, though, as we pause all our jobs out of paranoia for upgrades.
My strategy is to avoid any downtime at all, since downtime means lost productivity.
> On 3/1/19 8:10 AM, Ole Holm Nielsen wrote:
>> We're one of the many Slurm sites which run the slurmdbd database
>> daemon on the same server as the slurmctld daemon. This works without
>> problems at our site given our modest load; however, SchedMD
>> recommends running the daemons on separate servers.
>> Contemplating how to upgrade our cluster from Slurm 17.11 to 18.08,
>> I've come to appreciate the advantage of running the daemons on
>> separate servers: One can upgrade slurmdbd to 18.08 while keeping
>> slurmctld at 17.11 (for a while at least). This enables us to upgrade
>> to 18.08 in the recommended order without any interruption to our
>> running jobs and without any cluster downtime.
>> I've been collecting various pieces of information about Slurm
>> upgrades and I've come up with a tested procedure for migrating the
>> slurmdbd service (on a CentOS/RHEL 7 system) to a new server:
>> The basic idea is that slurmctld continues happily while slurmdbd is
>> down, so you can migrate the MySQL database and slurmdbd behind the
>> scenes. When the new slurmdbd server is up and running, you
>> reconfigure slurm.conf on the cluster.
>> Upgrading slurmctld and slurmd is another topic, and this is discussed
>> in my Wiki page
>> I'd appreciate comments and suggestions about my procedure.
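
The last step of the quoted procedure, pointing slurm.conf at the new slurmdbd server, could be sketched roughly as below. This assumes the server is named through the `AccountingStorageHost` parameter; the function name and the hostnames are hypothetical, and the sketch only edits text in memory, it does not touch a real file or restart any daemon:

```python
import re


def set_accounting_host(conf_text, new_host):
    """Return conf_text with AccountingStorageHost set to new_host.

    Commented-out lines are left alone; if no active setting exists,
    one is appended at the end.
    """
    pattern = re.compile(
        r"^([ \t]*AccountingStorageHost[ \t]*=[ \t]*)\S+",
        re.IGNORECASE | re.MULTILINE,
    )
    updated, count = pattern.subn(lambda m: m.group(1) + new_host, conf_text)
    if count == 0:
        updated = conf_text.rstrip("\n") + "\nAccountingStorageHost=" + new_host + "\n"
    return updated


# Hypothetical example configuration and hostnames
conf = (
    "ClusterName=mycluster\n"
    "AccountingStorageType=accounting_storage/slurmdbd\n"
    "AccountingStorageHost=oldserver\n"
)
print(set_accounting_host(conf, "newdbd.example.com"))
```

The edited slurm.conf would still have to be distributed to the cluster and re-read by the daemons, which this sketch leaves out.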