[slurm-users] HA for slurmdbd
Xand Meaden
xand.meaden at kcl.ac.uk
Tue Feb 15 15:46:55 UTC 2022
Hello,
I'm wondering what others are doing to make their slurmdbd service
resilient? We have the following setup right now:
- two VMs running slurmctld (and also slurmdbd)
- shared storage for StateSaveLocation using CephFS
- three-way MySQL cluster using Percona XtraDB
However, I can see no "Slurm native" way to make slurmdbd resilient -
there is no option for a backup server in slurm.conf. I naively tried
setting the AccountingStorageHost to "localhost" but this only worked on
the primary control node.
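For reference, the attempt described above corresponds roughly to the following slurm.conf fragment. This is only a sketch: the AccountingStorageType line is assumed (it is the usual setting when accounting goes through slurmdbd), and the rest of the file is omitted.

```
# slurm.conf (fragment, identical on both control nodes)
# Assumed: accounting is routed through slurmdbd.
AccountingStorageType=accounting_storage/slurmdbd
# What was tried: every node looks for slurmdbd on itself,
# which only worked on the primary control node.
AccountingStorageHost=localhost
```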
Can we use something like Keepalived to present slurmdbd running on both
control nodes via a floating IP, or will this cause complications with
Slurm's use of it?
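To make the question concrete, the floating-IP idea might look something like the Keepalived fragment below. It is an untested sketch, not a working setup: the instance name, interface, virtual_router_id, priority and the 192.0.2.10 address are all placeholders.

```
# keepalived.conf (fragment, on each control node; all values are placeholders)
vrrp_instance SLURMDBD {
    state BACKUP              # both nodes start as BACKUP; priority decides the holder
    interface eth0            # placeholder NIC name
    virtual_router_id 51      # placeholder; must match on both nodes
    priority 100              # use a lower value on the second node
    advert_int 1
    virtual_ipaddress {
        192.0.2.10/24         # placeholder floating IP for slurmdbd
    }
}
```

AccountingStorageHost in slurm.conf would then point at the floating address (or a name resolving to it) instead of "localhost".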
Thanks for any advice,
Xand