[slurm-users] Migrate the slurmdbd service to another server
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Mon Mar 4 18:12:10 UTC 2019
On 04-03-2019 16:30, Loris Bennett wrote:
>> On 3/4/19 2:26 PM, Loris Bennett wrote:
>>> Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk> writes:
>>>> We're one of the many Slurm sites which run the slurmdbd database daemon on the
>>>> same server as the slurmctld daemon. This works without problems at our site
>>>> given our modest load; however, SchedMD recommends running the daemons on
>>>> separate servers.
>>>> Contemplating how to upgrade our cluster from Slurm 17.11 to 18.08, I've come to
>>>> appreciate the advantage of running the daemons on separate servers: One can
>>>> upgrade slurmdbd to 18.08 while keeping slurmctld at 17.11 (for a while at
>>>> least). This enables us to upgrade to 18.08 in the recommended order without
>>>> any interruption to our running jobs and without any cluster downtime.
>>> Can't one do this even with only one server? We have always run both
>>> slurmctld and slurmdbd on one machine and have performed all the updates
>>> without any downtime.
>> For a minor upgrade from 17.11.x to 17.11.y there is no issue because the MySQL
>> database layout is unchanged.
>> Major upgrades such as 17.11 to 18.08 are potentially riskier; see for example
>> this list thread "Extreme long db upgrade 16.05.6 -> 17.11.3":
>> I recommend to study the instructions in
> That is indeed the protocol we follow.
>> See also the slides on "Upgrading" in
>> https://slurm.schedmd.com/SLUG18/field_notes2.pdf from the SLUG meeting 2018
>> Updating the database layout during a Slurm major upgrade can in special
>> situations lead to problems, so it's safer to do the upgrade separately for
>> slurmdbd and slurmctld. This is why I've decided to move my slurmdbd and
>> database to a separate server now. The slurmctld which governs the entire
>> cluster is thereby unaffected as I "play" with the database upgrade, and I can
>> upgrade Slurm without any cluster downtime.
> I don't understand how the separation of the two services onto two
> machines in the production environment makes such a difference. No
> matter where the slurmdbd is running, the slurmctld will attempt to
> contact it and cache data if the slurmdbd is unreachable. Or is the
> point more that, with a second machine you can do an offline conversion
> of the database, i.e. it is good to have a test and a production
This is a nice discussion! My reasoning is:
If slurmdbd and slurmctld both run on the same machine, you MUST upgrade
the RPMs simultaneously, for example, from 17.11.13 to 18.08.5. When
slurmdbd runs on a separate machine, you can upgrade that one without
upgrading slurmctld at the same time.
Mind you, SchedMD's recommended incremental upgrade sequence consists of
these enumerated steps:
1. slurmdbd
2. slurmctld
3. slurmd (on nodes)
4. Slurm commands (on login hosts)
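
As a rough illustration, the upgrade order discussed in this thread (slurmdbd, then slurmctld, then slurmd, then the commands) can be checked with a few lines of Python. This is only a sketch: the function names, the dictionary layout, and the choice to compare only the major release (the first two numbers of the version) are assumptions made for illustration, not anything shipped with Slurm.

```python
# Components in the order they are upgraded: slurmdbd first, commands last.
UPGRADE_ORDER = ["slurmdbd", "slurmctld", "slurmd", "commands"]


def major_release(version):
    """Return the major release of a version string, e.g. '17.11.13' -> (17, 11)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def is_major_upgrade(old, new):
    """True if the upgrade changes the major release (e.g. 17.11 -> 18.08)."""
    return major_release(old) != major_release(new)


def follows_upgrade_order(versions):
    """True if no component is on a newer major release than one upgraded before it."""
    releases = [major_release(versions[name]) for name in UPGRADE_ORDER]
    return all(earlier >= later for earlier, later in zip(releases, releases[1:]))


# slurmdbd already upgraded, everything else still on 17.11: a valid intermediate state.
print(follows_upgrade_order({
    "slurmdbd": "18.08.5", "slurmctld": "17.11.13",
    "slurmd": "17.11.13", "commands": "17.11.13",
}))  # True

# slurmctld upgraded ahead of slurmdbd: violates the order.
print(follows_upgrade_order({
    "slurmdbd": "17.11.13", "slurmctld": "18.08.5",
    "slurmd": "17.11.13", "commands": "17.11.13",
}))  # False
```

With both daemons on one machine and upgraded from the same RPMs, the first intermediate state (slurmdbd newer than slurmctld) cannot occur, which is the point made above.
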
There is a risk involved in lumping steps 1+2 together into one step,
especially if the database upgrade somehow has a problem or takes a very
long time. What if you're forced to roll back and downgrade slurmdbd to
the old version? In that case, problems may arise from downgrading
slurmctld at the same time.
A crucial part of slurmctld is the StateSaveLocation
(/var/spool/slurmctld) directory which is being updated all the time due
to cluster activity. You don't want to compromise the operation of
slurmctld while upgrading slurmdbd.
I certainly recommend testing and timing the database and slurmdbd
upgrade on a non-production node before the real upgrade.
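
One way to time such a trial run is a small wrapper that measures how long a command takes. This is a minimal sketch: `time_command` is a hypothetical helper, and the command timed here is only a placeholder that would be replaced by whatever step performs the database conversion on the test node.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run a command, wait for it to exit, and return (exit code, elapsed seconds)."""
    start = time.monotonic()
    result = subprocess.run(argv)
    elapsed = time.monotonic() - start
    return result.returncode, elapsed


# Placeholder command; replace it with the actual conversion step on the test node.
code, seconds = time_command([sys.executable, "-c", "pass"])
print(f"exit code {code}, took {seconds:.1f} s")
```

Only commands that exit on their own can be timed this way; a daemon that keeps running after the conversion would need a different approach, such as watching its log for the end of the conversion.
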
> On the other hand, the Quick Start Admin Guide
> (https://slurm.schedmd.com/quickstart_admin.html) does mention "head
> node, compute nodes, and slurmdbd node". I had always assumed a
> separate slurmdbd node was mainly useful for performance reasons at
> sites with a huge throughput of jobs, but maybe I am missing something.
For me safety of upgrading is most important. You're right that
high-throughput sites will want to separate the dbd and ctld services for
performance reasons.