[slurm-users] Setup for backup slurmctld

Brian Andrus toomuchit at gmail.com
Wed Feb 26 20:56:27 UTC 2020


Any shared filesystem that both systems can get to will work.

I have done it with NFS, Gluster, appliances (NetApp), etc.

Being in a separate datacenter is fine, but you will see some latency, 
which you have likely already addressed if you are physically splitting a 
network like that.

It is also very easy to do. Just add the lines for the backup controller 
to your slurm.conf, start it up, and run a reconfigure so that all running 
nodes are aware of it.
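
As a rough sketch of what those slurm.conf lines might look like (the 
hostnames and the path here are made up for illustration, and this assumes 
Slurm 18.08 or newer, where SlurmctldHost replaced the older 
ControlMachine/BackupController pair):

```
# slurm.conf -- keep the same file on both controllers and on all nodes
# The first SlurmctldHost is the primary; any later ones are backups.
SlurmctldHost=ctld-primary      # hypothetical hostname
SlurmctldHost=ctld-backup       # hypothetical hostname

# Must sit on the shared filesystem (NFS, Gluster, etc.) and be
# readable and writable from both controllers.
StateSaveLocation=/shared/slurm/state    # hypothetical path

# Seconds the backup waits for the primary before taking over.
SlurmctldTimeout=120
```

After starting slurmctld on the backup, "scontrol reconfigure" pushes the 
change to the running daemons, and "scontrol ping" should then report the 
status of both controllers.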

Brian Andrus

On 2/26/2020 12:48 PM, Joshua Baker-LePain wrote:
> We're planning the migration of our moderately sized cluster (~400 
> nodes, 40K jobs/day) from SGE to slurm.  We'd very much like to have a 
> backup slurmctld, and it'd be even better if our backup slurmctld 
> could be in a separate data center from the primary (though they'd 
> still be on the same private network).  So, how are folks sharing the 
> StateSaveLocation in such a setup?  Any and all recommendations 
> (including those with the 2 slurmctld servers in the same rack) 
> welcome.  Thanks!
>


