Hi All,
I’m trying to get slurm-23.11.3 running on a standalone Ubuntu 20.04 system, and I’m running into an issue I cannot find the answer to. After compiling and installing, when I fire up slurmctld and slurmd I get an error from sinfo:
sinfo: error: resolve_ctls_from_dns_srv: res_nsearch error: Unknown host
sinfo: error: fetch_config: DNS SRV lookup failed
sinfo: error: _establish_config_source: failed to fetch config
sinfo: fatal: Could not establish a configuration source
It looks like a DNS issue, but the system has no trouble resolving its own hostname or localhost. The slurm.conf file is also being read properly by the daemons, since the logs are written to the location I set in it. I see that many others have hit the same issue, but I cannot find a clear resolution.
I have Slurm running on a standalone system in another lab with an identical setup without issue. Any advice would be greatly appreciated.
Thanks,
Mike
That error means sinfo is not finding the slurm.conf file, so it falls back to configless mode and queries DNS for the SRV records.
Verify that you have /etc/slurm/slurm.conf on the system you are running sinfo on.
Also ensure you don't have any environment variables set that tell it to look elsewhere.
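A minimal sketch of those two checks, assuming /etc/slurm/slurm.conf is the expected default location (a build from source may have a different compiled-in default). The function name check_slurm_conf is made up for illustration and is not part of Slurm:

```shell
# Report whether the slurm.conf that client commands would be pointed at
# is readable. SLURM_CONF, if set, takes the place of the default path.
# Assumption: /etc/slurm/slurm.conf is the default named in this thread.
check_slurm_conf() {
    conf="${SLURM_CONF:-/etc/slurm/slurm.conf}"
    if [ -r "$conf" ]; then
        echo "readable: $conf"
    else
        echo "missing or unreadable: $conf"
        return 1
    fi
}

# List any SLURM_* variables already set in this shell.
env | grep '^SLURM_' || echo "no SLURM_* variables set"

# Run the check without aborting the shell if the file is absent.
check_slurm_conf || true
```

If the file turns out to be missing or unreadable for the user running sinfo, that would be consistent with the fallback to the DNS SRV lookup described above.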
Brian Andrus
On 1/26/2024 6:38 AM, Michael Lewis wrote:
Thank you, Brian. I do have the slurm.conf file, and the daemons did read it properly, as the logs were showing up in the location I had set. What I had to do was set the SLURM_CONF environment variable to point to the slurm.conf file. All good now, it seems.
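A sketch of that fix, for anyone hitting the same error. The path below is a placeholder, not the actual location on this system; substitute wherever your slurm.conf lives:

```shell
# Hypothetical location; replace with the real path to your slurm.conf.
export SLURM_CONF=/path/to/slurm.conf

# Client commands such as sinfo read SLURM_CONF from the environment.
if command -v sinfo >/dev/null 2>&1; then
    sinfo || echo "sinfo still failing; check the path above"
else
    echo "sinfo not found in PATH"
fi
```

The export only lasts for the current shell session. To make it persistent, the same line could go in a shell startup file, which is a choice left to the local setup.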
Mike
From: slurm-users slurm-users-bounces@lists.schedmd.com on behalf of Brian Andrus toomuchit@gmail.com
Reply-To: Slurm User Community List slurm-users@lists.schedmd.com
Date: Friday, January 26, 2024 at 12:06 PM
To: "slurm-users@lists.schedmd.com" slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] sinfo: error: resolve_ctls_from_dns_srv: res_nsearch error: Unknown host