The deployment scenario is as follows:
   nodeA                                      nodeB
(slurmctld)                             (backup slurmctld)
     |     \-------------------------------/     |
     |     /                               \     |
   nodeC                                      nodeD
 (slurmdbd)                              (backup slurmdbd)
  (mysql)    <--multi master replica-->       (mysql)
Since the database is multi-master replicated, each slurmdbd should talk only to the MySQL instance on its own node.
In that case, how should we set up slurmdbd.conf? The file has the options "DbdAddr", "DbdHost" and "DbdBackupHost".
Should they be consistent between nodeC and nodeD, with only "StorageHost" differing? Such as:
(slurmdbd.conf on nodeC)    | (slurmdbd.conf on nodeD)
DbdAddr = nodeC             | DbdAddr = nodeC
DbdHost = nodeC             | DbdHost = nodeC
DbdBackupHost = nodeD       | DbdBackupHost = nodeD
StorageHost = nodeC         | StorageHost = nodeD
Or should each node get its own configuration, without "DbdBackupHost" at all, like:
(slurmdbd.conf on nodeC)    | (slurmdbd.conf on nodeD)
DbdAddr = nodeC             | DbdAddr = nodeD
DbdHost = nodeC             | DbdHost = nodeD
StorageHost = nodeC         | StorageHost = nodeD
I'm quite confused about the usage of DbdAddr and DbdHost. What is the difference between them, and why does only DbdHost have a backup counterpart?
Another confusing point is how DbdBackupHost works. I guess it is slurmctld that is responsible for selecting an available slurmdbd. Since slurm.conf already contains "AccountingStorageHost" and "AccountingStorageBackupHost", why do we need to set the backup slurmdbd again on the slurmdbd side?
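For context, the slurm.conf settings I am referring to would look roughly like this on nodeA and nodeB. The host names follow the diagram above, and the values are my assumption of how this layout maps onto those options, not a tested configuration:

```
# slurm.conf on nodeA and nodeB (sketch, values assumed from the diagram)
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=nodeC          # primary slurmdbd
AccountingStorageBackupHost=nodeD    # backup slurmdbd
```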