[slurm-users] Maintaining slurm config files for test and production clusters

Paul Edmon pedmon at cfa.harvard.edu
Wed Jan 4 18:59:02 UTC 2023


The symlink method for slurm.conf is what we do as well. We host 
slurm.conf on an NFS mount from the slurm master, and then symlink 
slurm.conf to the copy on that NFS share.
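
For anyone who wants to script the switch, here is a rough sketch of 
repointing the symlink. It is only an illustration, not what either site 
actually runs: the helper name point_symlink is made up, and it assumes a 
POSIX system where renaming within one directory is atomic. The new link 
is built under a temporary name and renamed over the old one, so slurm.conf 
never goes missing mid-switch.

```python
import os


def point_symlink(link_path: str, target: str) -> None:
    """Make link_path a symlink to target, replacing any existing link.

    Hypothetical helper for illustration; not part of Slurm.
    """
    # Build the new symlink under a temporary name in the same directory.
    tmp_path = f"{link_path}.tmp.{os.getpid()}"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)  # clear a stale leftover from an earlier run
    os.symlink(target, tmp_path)
    # os.replace renames the link itself (it does not follow it), so the
    # old symlink is swapped for the new one in a single step.
    os.replace(tmp_path, link_path)
```

Something like point_symlink("/etc/slurm/slurm.conf", 
"/etc/slurm/slurm-dev.conf") would then select the dev config (the 
directory for slurm-dev.conf is an assumption here), and slurmd still 
needs a restart afterwards to pick up the change.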


-Paul Edmon-


On 1/4/2023 1:53 PM, Brian Andrus wrote:
>
> One of the simple ways I have dealt with different configs is to 
> symlink /etc/slurm/slurm.conf to the appropriate file (e.g., 
> slurm-dev.conf or slurm-prod.conf).
>
>
> In fact, I use the symlink for my dev and nothing (configless) for 
> prod. Then I can change a running node to/from dev/prod by merely 
> creating/deleting the symlink and restarting slurmd.
>
>
> Just an option that may work for you.
>
> I also use separate repos for prod/dev when I am working on 
> packages/testing. I rather prefer that separation so I don't have 
> someone accidentally update to a package that is not production-ready.
>
>
> Brian Andrus
>
>
> On 1/4/2023 9:22 AM, Groner, Rob wrote:
>> We currently have a test cluster and a production cluster, all on the 
>> same network.  We try things on the test cluster, and then we gather 
>> those changes and make a change to the production cluster.  We're 
>> doing that through two different repos, but we'd like to have a 
>> single repo to make the transition from testing configs to publishing 
>> them more seamless.  The problem is, of course, that the test and 
>> production clusters have different cluster names, as well as 
>> different nodes within them.
>>
>> Using the include directive, I can pull all of the NodeName lines out 
>> of slurm.conf and put them into %c-nodes.conf files, one for 
>> production, one for test.  That still leaves me with two problems:
>>
>>   * The clustername itself will still be a problem.  I WANT the same
>>     slurm.conf file between test and production...but the clustername
>>     line will be different for them both.  Can I use an env var in
>>     that cluster name, because on production there could be a
>>     different env var value than on test?
>>   * The gres.conf file.  I tried using the same "include" trick that
>>     works on slurm.conf, but it failed because it did not know what
>>     the "ClusterName" was.  I think that means that either it doesn't
>>     work for anything other than slurm.conf, or that the clustername
>>     will have to be defined in gres.conf as well?
>>
>> Any other suggestions of how to keep our slurm files in a single 
>> source control repo, but still have the flexibility to have them run 
>> elegantly on either test or production systems?
>>
>> Thanks.
>>