[slurm-users] SlurmDBD losing connection to the backend MariaDB
rchang.lists at gmail.com
Wed Nov 2 01:49:42 UTC 2022
Does it mean it is best to use a single slurmdbd host in my case?
My primary slurmctld host is also the backup slurmdbd host. My worry is
that if the primary slurmdbd host (which is also the MariaDB server) goes
down, will the backup slurmdbd be able to cache data and wait until
MariaDB catches up?
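
In case it helps anyone sizing this window, here is a rough sketch of the
arithmetic. It assumes what I read in the slurm.conf man page: slurmctld
queues at most MaxDBDMsgs messages while slurmdbd is unreachable, and the
default is the larger of 10000 and MaxJobCount * 2 + node count * 4. Please
check that against the man page for your Slurm version. The function names
and the messages-per-hour figure are made up for illustration, not
measured on any real cluster.

```python
import math


def default_max_dbd_msgs(max_job_count: int, node_count: int) -> int:
    """Default queue limit as described in the slurm.conf man page
    (assumption: verify for your Slurm version)."""
    return max(10000, max_job_count * 2 + node_count * 4)


def outage_window_hours(max_dbd_msgs: int, msgs_per_hour: float) -> float:
    """Hours until the slurmctld queue would fill at a steady message rate."""
    if msgs_per_hour <= 0:
        return math.inf  # nothing is being queued, so the queue never fills
    return max_dbd_msgs / msgs_per_hour


if __name__ == "__main__":
    # Hypothetical cluster: MaxJobCount=10000, 500 nodes.
    limit = default_max_dbd_msgs(max_job_count=10000, node_count=500)
    # Hypothetical load: 6000 accounting messages per hour.
    hours = outage_window_hours(limit, msgs_per_hour=6000)
    print(f"queue limit: {limit} messages, window: {hours:.1f} hours")
```

The real message rate depends on how many jobs and steps start and finish,
so the figure above is only a placeholder. During an actual outage, the
"DBD Agent queue size" line in the sdiag output shows how full the queue is.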
On 11/2/2022 2:00 AM, Brian Andrus wrote:
> Fair enough, it is actually slurmctld that does the caching. Technical
> typo on my part there.
> Just trying to let the user know that there is a limited window in
> which the database has to come back before information is lost.
> Brian Andrus
> On 11/1/2022 1:43 AM, Ole Holm Nielsen wrote:
>> Hi Brian,
>> On 11/1/22 05:28, Brian Andrus wrote:
>>> It caches up to a point. As I understand it, that is about an hour
>>> (depending on size and how busy the cluster is, as well as available
>>> memory, etc).
>> Have you found any documentation of slurmdbd caching? It's
>> well-known that slurmctld caches information while slurmdbd is down,
>> see for example page 30 in the talk "Field Notes Mark 2: Random
>> Musings From Under A New Hat" by Tim Wickberg, SchedMD:
>>> For slurmdbd, the critical element in the failure domain is
>>> MySQL, not slurmdbd. slurmdbd itself is stateless.
>>> ● slurmctld will cache accounting records (up to a limit) if
>>> slurmdbd is unavailable. This can be hours+ to days+
>>> depending on your system without data loss.
>> The statelessness of slurmdbd makes me think that it can't cache any
>>  https://slurm.schedmd.com/publications.html
>>> On 10/31/2022 9:20 PM, Richard Chang wrote:
>>>> Just for my info, I would like to know what happens when SlurmDBD
>>>> loses its connection to the backend database, for example MariaDB.
>>>> Does it cache the accounting info and keep it until the DB comes
>>>> back up, or does it panic and shut down?