[slurm-users] Maintaining slurm config files for test and production clusters
Paul Edmon
pedmon at cfa.harvard.edu
Wed Jan 4 18:59:02 UTC 2023
The symlink method for slurm.conf is what we do as well. We have an NFS
mount from the slurm master that hosts slurm.conf, and we then symlink
slurm.conf to the copy on that NFS share.
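
For anyone who wants to script the switch, here is a rough sketch in
Python of repointing (or removing) the symlink. The function names and
the file names in the demo are made up for illustration, and the demo
runs in a temporary directory rather than /etc/slurm. It does not
restart slurmd; per Brian's note below you would still do that yourself
afterwards.

```python
import os
import tempfile
from pathlib import Path


def point_symlink(link: Path, target: Path) -> None:
    """Make `link` a symlink to `target`, replacing any existing link.

    The new link is created under a temporary name and then renamed over
    the old one, so readers never see a missing slurm.conf.
    """
    tmp = link.with_name(link.name + ".tmp")
    if tmp.is_symlink() or tmp.exists():
        tmp.unlink()
    os.symlink(target, tmp)
    os.replace(tmp, link)  # rename(2) replaces the old link in one step


def remove_symlink(link: Path) -> None:
    """Delete the symlink if present (e.g. to fall back to configless)."""
    if link.is_symlink():
        link.unlink()


if __name__ == "__main__":
    # Demo in a scratch directory; the file names are hypothetical.
    with tempfile.TemporaryDirectory() as d:
        conf_dir = Path(d)
        dev = conf_dir / "slurm-dev.conf"
        prod = conf_dir / "slurm-prod.conf"
        dev.write_text("ClusterName=dev\n")
        prod.write_text("ClusterName=prod\n")

        link = conf_dir / "slurm.conf"
        point_symlink(link, dev)
        print(link.read_text().strip())
        point_symlink(link, prod)
        print(link.read_text().strip())
        remove_symlink(link)
        print(link.exists())
```

On a real node the link would be your actual slurm.conf path and the
target would be the file on the shared mount.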
-Paul Edmon-
On 1/4/2023 1:53 PM, Brian Andrus wrote:
>
> One of the simple ways I have dealt with different configs is to
> symlink /etc/slurm/slurm.conf to the appropriate file (e.g.,
> slurm-dev.conf or slurm-prod.conf).
>
>
> In fact, I use the symlink for my dev and nothing (configless) for
> prod. Then I can change a running node to/from dev/prod by merely
> creating/deleting the symlink and restarting slurmd.
>
>
> Just an option that may work for you.
>
> I also use separate repos for prod/dev when I am working on
> packages/testing. I rather prefer that separation so I don't have
> someone accidentally update to a package that is not production-ready.
>
>
> Brian Andrus
>
>
> On 1/4/2023 9:22 AM, Groner, Rob wrote:
>> We currently have a test cluster and a production cluster, all on the
>> same network. We try things on the test cluster, and then we gather
>> those changes and make a change to the production cluster. We're
>> doing that through two different repos, but we'd like to have a
>> single repo to make the transition from testing configs to publishing
>> them more seamless. The problem is, of course, that the test cluster
>> and production clusters have different cluster names, as well as
>> different nodes within them.
>>
>> Using the include directive, I can pull all of the NodeName lines out
>> of slurm.conf and put them into %c-nodes.conf files, one for
>> production, one for test. That still leaves me with two problems:
>>
>> * The clustername itself will still be a problem. I WANT the same
>> slurm.conf file between test and production...but the clustername
>> line will be different for them both. Can I use an env var in
>> that cluster name, because on production there could be a
>> different env var value than on test?
>> * The gres.conf file. I tried using the same "include" trick that
>> works on slurm.conf, but it failed because it did not know what
>> the "ClusterName" was. I think that means that either it doesn't
>> work for anything other than slurm.conf, or that the clustername
>> will have to be defined in gres.conf as well?
>>
>> Any other suggestions of how to keep our slurm files in a single
>> source control repo, but still have the flexibility to have them run
>> elegantly on either test or production systems?
>>
>> Thanks.
>>