[slurm-users] Database cluster

Daniel L'Hommedieu dlhommedieu at gmail.com
Tue Jan 23 13:38:56 UTC 2024


Hi Diego.

In our setup, the database is critical.  We have some wrapper scripts that consult the database for information, and we also set environment variables on login, based on user/partition associations.  If the database is down, none of those things work.
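
As an illustration of that dependency, here is a minimal sketch of a login-time association lookup. It is hypothetical and is not the actual wrapper scripts: the function names `parse_associations` and `lookup_associations` are made up for this example, and it assumes `sacctmgr` is on the PATH and that a 5-second timeout is acceptable. The point is that the lookup goes through slurmdbd, so it has nothing to return when the database is unreachable.

```python
import subprocess


def parse_associations(output):
    """Parse pipe-separated `sacctmgr -nP` output into (user, account, partition) tuples."""
    rows = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split("|")
        if len(fields) != 3:
            # Skip anything that does not match the three requested columns.
            continue
        rows.append(tuple(fields))
    return rows


def lookup_associations(user, timeout=5):
    """Ask slurmdbd (via sacctmgr) for a user's associations.

    Returns a list of tuples, or None if the query failed, for example
    because slurmdbd or its MySQL backend is down or too slow to answer.
    """
    cmd = [
        "sacctmgr", "-nP", "show", "associations",
        f"user={user}", "format=User,Account,Partition",
    ]
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout, check=True
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return parse_associations(result.stdout)
```

A login script built this way would call `lookup_associations(user)` and could only set its partition-specific environment variables when the result is not `None`.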

I doubt there is appetite in the organization to change the way our setup works, but if we can improve database reliability, that would be a good solution.  Mostly I am interested in protecting against hardware failure, which is why I’m interested in a cluster solution such as XtraDB.

Thanks.

Daniel

> On Jan 23, 2024, at 03:23, Diego Zuccato <diego.zuccato at unibo.it> wrote:
> 
> IIUC the database is not "critical": if it goes down, you lose access to some statistics. But job data gets cached anyway and the db will be updated when it comes back online.
> 
> Diego
> 
> On 22/01/2024 18:23, Daniel L'Hommedieu wrote:
>> Community:
>> What do you do to ensure database reliability in your SLURM environment?  We can have multiple controllers and multiple slurmdbds, but my understanding is that slurmdbd can be configured with only a single MySQL server, so what do you do?  Do you have that “single MySQL server” be a cluster, such as Percona XtraDB?  Do you use MySQL replication, then manually switch slurmdbd to a replication slave if the master goes down?  Do you do something else?
>> Thanks.
>> Daniel
> 
> -- 
> Diego Zuccato
> DIFA - Dip. di Fisica e Astronomia
> Servizi Informatici
> Alma Mater Studiorum - Università di Bologna
> V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
> tel.: +39 051 20 95786
> 