[slurm-users] Migrate the slurmdbd service to another server

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Fri Mar 1 13:10:08 UTC 2019


We're one of the many Slurm sites that run the slurmdbd database daemon
on the same server as the slurmctld daemon.  This works without problems
at our site given our modest load.  However, SchedMD recommends running
the daemons on separate servers.

Contemplating how to upgrade our cluster from Slurm 17.11 to 18.08, I've 
come to appreciate the advantage of running the daemons on separate 
servers: One can upgrade slurmdbd to 18.08 while keeping slurmctld at
17.11 (for a while at least).  This enables us to upgrade to 18.08 in
the recommended order (slurmdbd first, then slurmctld, then slurmd)
without any interruption to our running jobs and without any cluster
downtime.

I've been collecting various pieces of information about Slurm upgrades 
and I've come up with a tested procedure for migrating the slurmdbd 
service (on a CentOS/RHEL 7 system) to a new server:

https://wiki.fysik.dtu.dk/niflheim/Slurm_database#migrate-the-slurmdbd-service-to-another-server

The basic idea is that slurmctld continues happily while slurmdbd is 
down, so you can migrate the MySQL database and slurmdbd behind the 
scenes.  When the new slurmdbd server is up and running, you reconfigure 
slurm.conf on the cluster.
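
As a rough illustration of that sequence, here is a minimal dry-run
sketch in Python.  It only builds and prints an ordered list of steps
and executes nothing.  The host names, the dump file path and the
function name are hypothetical; the database name slurm_acct_db is
the Slurm default and may differ at your site.  Creating the MySQL
user and grants and adapting slurmdbd.conf on the new server are
left out here; the Wiki page above remains the authoritative,
tested procedure.

```python
import shlex


def migration_plan(old_host, new_host,
                   db_name="slurm_acct_db",
                   dump_file="/root/slurm_acct_db.sql"):
    """Return ordered (host, step) pairs for moving slurmdbd.

    Nothing is executed; this is a checklist generator only.
    Host names and dump_file are illustrative assumptions.
    """
    q = shlex.quote
    return [
        # slurmctld keeps running while slurmdbd is down
        (old_host, "systemctl stop slurmdbd"),
        # dump the accounting database on the old server
        (old_host, f"mysqldump {q(db_name)} > {q(dump_file)}"),
        (old_host, f"scp {q(dump_file)} {q(new_host)}:{q(dump_file)}"),
        # create an empty database on the new server and load the dump
        (new_host, f"mysqladmin create {q(db_name)}"),
        (new_host, f"mysql {q(db_name)} < {q(dump_file)}"),
        (new_host, "systemctl start slurmdbd"),
        # manual step: point the cluster at the new slurmdbd server
        ("cluster", f"set in slurm.conf: AccountingStorageHost={new_host}"),
    ]


for host, step in migration_plan("old-server", "new-server"):
    print(f"[{host}] {step}")
```

Printing the plan before touching anything makes it easy to review
the order of operations: slurmdbd is stopped first, and slurm.conf is
changed only after the new slurmdbd is running.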

Upgrading slurmctld and slurmd is a separate topic, which I discuss on
my Wiki page:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-slurm

I'd appreciate comments and suggestions about my procedure.

/Ole

-- 
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark


