[slurm-users] Maintaining slurm config files for test and production clusters

Brian Andrus toomuchit at gmail.com
Wed Jan 4 18:53:10 UTC 2023


One simple way I have dealt with different configs is to symlink 
/etc/slurm/slurm.conf to the appropriate file (e.g., slurm-dev.conf or 
slurm-prod.conf).


In fact, I use the symlink for dev and nothing (configless) for prod. 
Then I can switch a running node between dev and prod by merely 
creating or deleting the symlink and restarting slurmd.
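
For illustration, here is a minimal Python sketch of that switch. It is not 
a script from this thread: the function name set_slurm_mode, the "dev"/"prod" 
mode strings, and the use of "systemctl restart slurmd" are assumptions, and 
the restart step presumes slurmd runs as a systemd unit named slurmd.

```python
import subprocess
from pathlib import Path


def set_slurm_mode(conf_dir, mode, restart=False):
    """Create the dev symlink, or remove it so the node runs configless (prod).

    Hypothetical helper: names and modes are illustrative assumptions.
    """
    if mode not in ("dev", "prod"):
        raise ValueError(f"unknown mode: {mode!r}")

    link = Path(conf_dir) / "slurm.conf"

    # Only ever delete a symlink, never a real config file.
    if link.is_symlink():
        link.unlink()
    elif link.exists():
        raise RuntimeError(f"{link} is a regular file, refusing to replace it")

    if mode == "dev":
        # Relative target: slurm.conf -> slurm-dev.conf in the same directory.
        link.symlink_to("slurm-dev.conf")
    # For "prod" nothing is created; the node is assumed to be configless.

    if restart:
        # Assumption: slurmd is managed by systemd under the unit name "slurmd".
        subprocess.run(["systemctl", "restart", "slurmd"], check=True)
    return link
```

On a node this would be called as set_slurm_mode("/etc/slurm", "dev", 
restart=True) to move it to dev, and with "prod" to move it back. Refusing 
to touch a regular slurm.conf keeps the helper from deleting a hand-written 
config by mistake.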


Just an option that may work for you.

I also use separate repos for prod/dev when I am working on 
packages/testing. I prefer that separation so that nobody 
accidentally updates to a package that is not production-ready.


Brian Andrus


On 1/4/2023 9:22 AM, Groner, Rob wrote:
> We currently have a test cluster and a production cluster, all on the 
> same network.  We try things on the test cluster, and then we gather 
> those changes and make a change to the production cluster.  We're 
> doing that through two different repos, but we'd like to have a single 
> repo to make the transition from testing configs to publishing them 
> more seamless.  The problem is, of course, that the test and 
> production clusters have different cluster names, as well as different 
> nodes within them.
>
> Using the include directive, I can pull all of the NodeName lines out 
> of slurm.conf and put them into %c-nodes.conf files, one for 
> production, one for test.  That still leaves me with two problems:
>
>   * The clustername itself will still be a problem.  I WANT the same
>     slurm.conf file between test and production...but the clustername
>     line will be different for them both.  Can I use an env var in
>     that cluster name, because on production there could be a
>     different env var value than on test?
>   * The gres.conf file.  I tried using the same "include" trick that
>     works on slurm.conf, but it failed because it did not know what
>     the "ClusterName" was.  I think that means that either it doesn't
>     work for anything other than slurm.conf, or that the clustername
>     will have to be defined in gres.conf as well?
>
> Any other suggestions of how to keep our slurm files in a single 
> source control repo, but still have the flexibility to have them run 
> elegantly on either test or production systems?
>
> Thanks.
>