<div dir="ltr"><div class="gmail_default" style="font-family:monospace">Apologies up front: I don't have a complete answer. I'm not sure why the comma-separated form doesn't work, but my understanding is that failover is configured with a separate "SlurmctldHost" line for each controller, e.g.:</div><div class="gmail_default" style="font-family:monospace"><br></div><div class="gmail_default" style="font-family:monospace"><div class="gmail_default">SlurmctldHost=host1</div><div class="gmail_default"><div class="gmail_default">SlurmctldHost=host2</div><div class="gmail_default">...</div><div class="gmail_default"><br></div><div class="gmail_default">The list format seems to be used in some other scenario I don't completely understand. We use multiple lines for our HA arrangement, and it seems to be working OK.</div><div class="gmail_default"><br></div><div class="gmail_default"> - Michael</div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Dec 13, 2023 at 12:18 PM Jackson, Gary L. &lt;<a href="mailto:Gary.Jackson@jhuapl.edu">Gary.Jackson@jhuapl.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="msg7299019380862922547"><div lang="EN-US" style="overflow-wrap: break-word;"><div class="m_7299019380862922547WordSection1"><p class="MsoNormal">The SlurmctldHost value is set like the following in my slurm.conf:<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="m_7299019380862922547p1"><span class="m_7299019380862922547s1">SlurmctldHost=host0,host1</span><u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">That seems to be legal according to the documentation. 
However, I get error messages like the following:<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="m_7299019380862922547p1"><span class="m_7299019380862922547s1">$ srun id</span><u></u><u></u></p><p class="m_7299019380862922547p1"><span class="m_7299019380862922547s1">srun: error: get_addr_info: getaddrinfo() failed: Name or service not known</span><u></u><u></u></p><p class="m_7299019380862922547p1"><span class="m_7299019380862922547s1">srun: error: slurm_set_addr: Unable to resolve "host0,host1"</span><u></u><u></u></p><p class="m_7299019380862922547p1"><span class="m_7299019380862922547s1">srun: error: Unable to establish control machine address</span><u></u><u></u></p><p class="m_7299019380862922547p1"><span class="m_7299019380862922547s1">srun: error: Unable to allocate resources: Address already in use</span><u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">If I try to put IP addresses in parentheses per the documentation, I get different errors:<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="m_7299019380862922547p1"><span class="m_7299019380862922547s1">$ srun id</span><u></u><u></u></p><p class="m_7299019380862922547p1"><span class="m_7299019380862922547s1">srun: error: Bad value "host0(12.34.56.78),host1" for SlurmctldHost</span><u></u><u></u></p><p class="m_7299019380862922547p1"><span class="m_7299019380862922547s1">srun: error: No SlurmctldHost defined.</span><u></u><u></u></p><p class="m_7299019380862922547p1"><span class="m_7299019380862922547s1">srun: fatal: Unable to process configuration file</span><u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">If I put a single hostname, or a hostname with an address in parentheses as the value for SlurmctldHost, it works fine but I have no failover.<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">I’m running 23.02.6:<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p 
class="m_7299019380862922547p1"><span class="m_7299019380862922547s1">$ sinfo --version</span><u></u><u></u></p><p class="m_7299019380862922547p1"><span class="m_7299019380862922547s1">slurm 23.02.6</span><u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">What’s going on?<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><div><div><p class="MsoNormal"><span>-- <u></u><u></u></span></p></div></div><p class="MsoNormal"><span>Gary</span><u></u><u></u></p></div></div>
</div></blockquote></div>