[slurm-users] dual slurmctld and slurmdbd

Brian Andrus toomuchit at gmail.com
Thu Jul 4 00:54:52 UTC 2019

You're welcome :)

If you aren't happy with how long the failover takes, you may want to
look at the SlurmctldTimeout parameter in slurm.conf:

The interval, in seconds, that the backup controller waits for the 
primary controller to respond before assuming control. The default value 
is 120 seconds. May not exceed 65533.
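
The roughly two minutes you saw matches that 120-second default. As a
rough sketch only (the host names ctl1/ctl2 and the state path are made up,
and the SlurmctldHost lines assume Slurm 18.08 or later; older releases use
ControlMachine and BackupController instead), a shorter timeout might look
like this:

```
# slurm.conf fragment (hypothetical host names and path)
# First SlurmctldHost entry is the primary, second is the backup.
SlurmctldHost=ctl1
SlurmctldHost=ctl2
# Must be on storage that both controllers can see.
StateSaveLocation=/shared/slurm/state
# Seconds the backup waits for the primary before assuming control.
SlurmctldTimeout=30
```

The value 30 is just an illustration, not a recommendation. You can check
what the running controller is using with "scontrol show config".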

Brian Andrus

On 7/3/2019 2:45 PM, Tina Fora wrote:
> Thanks Brian Andrus and Chris Samuel.
> I was able to get it to work on our dev setup as primary/backup. Already
> had the shared state directory. If I take primary down it takes about two
> minutes for slurm commands to work again as the backup takes over. When I
> bring the primary back up it is a bit faster.
> Cheers.
>> On 2/7/19 1:48 pm, Tina Fora wrote:
>>> We run mysql on a dedicated machine with slurmctld and slurmdbd running
>>> on another machine. Now I want to add another machine running slurmctld
>>> and slurmdbd, and this machine will be on CentOS 7. Existing one is
>>> CentOS 6. Is this possible? Can I run two separate slurmctld and slurmdbd
>>> instances pointing to the same slurm config and database?
>> Are you trying to set up an HA system (where one controller runs both
>> and a second waits in the wings in case the first fails and will take
>> over)?
>> Or do you want them to run separate clusters?
>> If you want the second, and are happy to have the same users and QOS's
>> on both, then you can run one slurmctld per system and point them at the
>> same slurmdbd (having created a cluster for each there first).
>> If you want HA then it's a lot more complicated as you'll need a (fast)
>> shared filesystem between them both (we use GPFS for this) as both
>> slurmctld's need to see the same state directory all the time.
>> We also run slurmdbd in failover mode talking to the same MySQL/MariaDB
>> instance (but with a backup in case that fails).
>> All the best,
>> Chris
>> --
>>    Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
