Go look at that system. Likely it has an old slurmd running that should be removed.
Brian Andrus
On 9/20/2025 5:12 PM, Dhumal, Dr. Nilesh via slurm-users wrote:
Thanks. That resolved the issue. Now slurmctld is communicating with all nodes. However, the following errors come from nodes that are not listed in the slurm.conf file:
error: unpack_header: protocol_version 8960 not supported
[2025-09-20T20:05:03.455] error: destroy_forward: no init
[2025-09-20T20:05:03.461] error: slurm_unpack_received_msg: [192.168.3.119:55650] Incompatible versions of client and server code
192.168.3.119 is not mentioned in the slurm.conf file.
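If more hosts like this show up, a short script can pull the offending addresses and protocol versions out of the slurmctld log. The following is only a sketch: the regular expressions are written against the three log lines quoted above, the function name `summarize` is made up for illustration, and other Slurm versions may word these messages differently.

```python
import re

# Log excerpt copied from the message above.
LOG = """\
error: unpack_header: protocol_version 8960 not supported
[2025-09-20T20:05:03.455] error: destroy_forward: no init
[2025-09-20T20:05:03.461] error: slurm_unpack_received_msg: [192.168.3.119:55650] Incompatible versions of client and server code
"""

# Patterns are assumptions based only on the lines quoted above.
VERSION_RE = re.compile(r"protocol_version (\d+) not supported")
PEER_RE = re.compile(
    r"slurm_unpack_received_msg: \[(\d{1,3}(?:\.\d{1,3}){3}):(\d+)\]"
)


def summarize(log_text):
    """Return (sorted rejected protocol versions, sorted peer IP addresses)."""
    versions = {int(m.group(1)) for m in VERSION_RE.finditer(log_text)}
    peers = {m.group(1) for m in PEER_RE.finditer(log_text)}
    return sorted(versions), sorted(peers)


if __name__ == "__main__":
    versions, peers = summarize(LOG)
    print("rejected protocol versions:", versions)
    print("peers sending them:", peers)
```

Each address it reports is a machine worth logging in to and checking for a leftover slurmd or client from an older install.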
*Nilesh Dhumal*
Associate Professor of Chemistry,
http://faculty.fgcu.edu/ndhumal/
Coordinator, FGCU Computational Facility,
https://www.fgcu.edu/cas/facultyresources/computationalfacility/
SH-430; Department of Chemistry and Physics
Florida Gulf Coast University
10501 FGCU Boulevard South
Fort Myers, FL 33965-6565
Phone: (239) 745-4394
Email: ndhumal@fgcu.edu
*From:* Renfro, Michael <Renfro@tntech.edu>
*Sent:* Saturday, September 20, 2025 12:21 PM
*To:* Dhumal, Dr. Nilesh <ndhumal@fgcu.edu>; Slurm User Community List <slurm-users@lists.schedmd.com>
*Subject:* Re: Compute node not responding
Level II: Internal
slurmdbd is not a requirement to get things started [1], but you'll probably want it later.
It’s possible you’ve got host-based firewall rules on either system that are blocking communication. If you’re using firewalld, ufw, or something similar, stop their services, restart the slurmd and slurmctld services, and see if that helps.
[1] https://slurm.schedmd.com/quickstart_admin.html#dbd
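A successful ping only shows that ICMP gets through; a host firewall can still block the TCP ports Slurm uses. Before stopping the firewall entirely, a quick TCP connect test can show whether the ports are reachable. This is a rough sketch, assuming the default ports 6817 (SlurmctldPort) and 6818 (SlurmdPort); use the values from your slurm.conf if you changed them. The function name `tcp_port_open` and the hostnames in the usage note are placeholders.

```python
import socket
import sys

# Assumed defaults; override with SlurmctldPort / SlurmdPort from slurm.conf.
SLURMCTLD_PORT = 6817
SLURMD_PORT = 6818


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Each argument is a host:port pair, e.g. "headnode:6817".
    for target in sys.argv[1:]:
        host, port = target.rsplit(":", 1)
        state = "open" if tcp_port_open(host, int(port)) else "unreachable"
        print(f"{target}: {state}")
```

Saved under any filename and run with `headnode:6817` as an argument from the compute node, and with `node01:6818` from the head node, it covers both directions (both hostnames are placeholders). If a port shows as unreachable while ping works, a firewall rule or a stopped daemon is the likely cause.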
*From:* Dhumal, Dr. Nilesh via slurm-users <slurm-users@lists.schedmd.com>
*Date:* Saturday, September 20, 2025 at 11:02 AM
*To:* Slurm User Community List <slurm-users@lists.schedmd.com>
*Subject:* [slurm-users] Compute node not responding
Hello,
We recently installed Slurm 25 on our cluster. We are not tracking user accounts, so we did not configure the SQL database on the head node. We are running slurmctld on the head node and slurmd on the compute node. We are getting the following errors:

Head node: compute node not responding.
Compute node: [2025-09-19T15:30:23.461] error: Unable to register: Unable to contact slurm controller (connect failure)

Do we need to run slurmdbd on the head node? I checked the network connection by pinging the compute node from the head node. Do you have any suggestions to resolve this issue?

Thanks,
Nilesh