[slurm-users] [EXT] slurmctld error

ibotsis at isc.tuc.gr ibotsis at isc.tuc.gr
Mon Apr 5 19:00:01 UTC 2021

Hi Sean, yes, that is the host running both dbd and ctld, named se01. The firewall is inactive.


nc -nz 6819 || echo Connection not working


returns: Connection not working
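
A refused connection with no firewall in the way usually means nothing is listening on that port. One way to check this on se01 is to look at the listening TCP sockets. The sketch below is a minimal helper, assuming `ss` from iproute2 is available; `is_listening` is a hypothetical name, and the sample output is illustrative only, not taken from the machine discussed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reads `ss -ltn` style output on stdin and succeeds
# if some socket is in LISTEN state on the given port.
is_listening() {
    port="$1"
    # Column 4 of `ss -ltn` is "Local Address:Port"; compare the part after the last ':'.
    awk -v p="$port" '
        $1 == "LISTEN" { n = split($4, a, ":"); if (a[n] == p) found = 1 }
        END { exit (found ? 0 : 1) }'
}

# Illustrative sample only (made-up output, not from the poster's host).
sample='State  Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0      151    127.0.0.1:3306     0.0.0.0:*
LISTEN 0      128    0.0.0.0:6817       0.0.0.0:*'

if printf '%s\n' "$sample" | is_listening 6819; then
    echo "something is listening on port 6819"
else
    echo "nothing is listening on port 6819"
fi

# On the real host, feed it live data instead:
#   ss -ltn | is_listening 6819
```

If nothing is listening on 6819 while slurmdbd.service reports active, the slurmdbd log and the port configured for it would be the next things to look at.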





From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Sean Crosby
Sent: Monday, April 5, 2021 2:52 PM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] [EXT] slurmctld error


The error shows

slurmctld: debug2: Error connecting slurm stream socket at <> : Connection refused

slurmctld: error: slurm_persist_conn_open_without_init: failed to open persistent connection to se01:6819: Connection refused


Is that the IP address of the host running slurmdbd?

If so, check the iptables firewall running on that host, and make sure the ctld server can access port 6819 on the dbd host.

You can check this by running the following from the ctld host (requires the nmap-ncat package to be installed):

nc -nz 6819 || echo Connection not working

This tries to connect to port 6819 on the host. It prints nothing if the connection works, and prints "Connection not working" otherwise.

I would also run this test on the DBD server itself.
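
The check above can be wrapped in a small function so that it reports both outcomes explicitly. This is only a sketch: `check_port` and `DBD_ADDR` are hypothetical names, `127.0.0.1` is a placeholder default, and the `-w 3` timeout is an addition that is not part of the original command. Because `-n` disables DNS lookups, the host has to be given as a numeric IP address.

```shell
#!/bin/sh
# Hypothetical helper: probe a TCP port and report the result in words.
check_port() {
    host="$1"
    port="$2"
    # -n: no DNS lookups (host must be numeric)
    # -z: only test whether the port accepts a connection, send no data
    # -w 3: give up after three seconds (added here; not in the original command)
    if nc -nz -w 3 "$host" "$port" 2>/dev/null; then
        echo "Connection working"
    else
        echo "Connection not working"
    fi
}

# DBD_ADDR is a placeholder; set it to the numeric IP of the slurmdbd host.
DBD_ADDR="${DBD_ADDR:-127.0.0.1}"
check_port "$DBD_ADDR" 6819
```

Run it once from the ctld host and once on the DBD host. If it fails in both places, the cause is more likely the slurmdbd listener than the network path.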


Sean Crosby | Senior DevOpsHPC Engineer and HPC Team Lead
Research Computing Services | Business Services
The University of Melbourne, Victoria 3010 Australia



On Mon, 5 Apr 2021 at 21:00, Ioannis Botsis <ibotsis at isc.tuc.gr> wrote:

UoM notice: External email. Be cautious of links, attachments, or impersonation attempts



Hi Sean,


Thank you for your prompt response. I made the changes you suggested, but slurmctld still refuses to run. Please find attached the new slurmctld -Dvvvv output.






From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Sean Crosby
Sent: Monday, April 5, 2021 11:46 AM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] [EXT] slurmctld error


Hi Jb,


You have set AccountingStoragePort to 3306 in slurm.conf, which is the MySQL port running on the DBD host.


AccountingStoragePort is the port for the Slurmdbd service, and not for MySQL.


Change AccountingStoragePort to 6819 and it should fix your issues.
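
A quick way to see which value slurm.conf currently carries is to extract the setting and flag the MySQL port. The following is a rough sketch rather than an official tool: `get_acct_port` and `check_acct_port` are hypothetical helper names, and the demo runs against a throwaway file because the location of slurm.conf varies between installations.

```shell
#!/bin/sh
# Hypothetical helper: print the AccountingStoragePort value from a slurm.conf-style file.
get_acct_port() {
    awk -F= '
        # Commented-out lines start with "#" and therefore do not match the key.
        tolower($1) ~ /^[ \t]*accountingstorageport[ \t]*$/ {
            sub(/#.*/, "", $2)      # drop a trailing comment, if any
            gsub(/[ \t]/, "", $2)   # drop surrounding whitespace
            print $2
        }' "$1"
}

# Hypothetical helper: describe the configured port in words.
check_acct_port() {
    port="$(get_acct_port "$1")"
    case "$port" in
        3306) echo "AccountingStoragePort=3306 is the MySQL port; the thread suggests 6819 for slurmdbd" ;;
        "")   echo "AccountingStoragePort is not set" ;;
        *)    echo "AccountingStoragePort=$port" ;;
    esac
}

# Demo on a temporary file that mimics the misconfiguration described above.
tmp="$(mktemp)"
printf 'AccountingStorageType=accounting_storage/slurmdbd\nAccountingStoragePort=3306\n' > "$tmp"
check_acct_port "$tmp"
rm -f "$tmp"
```

Pass the path of the real slurm.conf to `check_acct_port` on the ctld host. slurmctld needs a restart after the value is changed.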


I also think you should comment out the lines 




You shouldn't need those lines




Sean Crosby | Senior DevOpsHPC Engineer and HPC Team Lead
Research Computing Services | Business Services
The University of Melbourne, Victoria 3010 Australia



On Mon, 5 Apr 2021 at 18:03, Ioannis Botsis <ibotsis at isc.tuc.gr> wrote:




Hello everyone,


I installed Slurm 19.05.5 from the Ubuntu repo, for the first time, on a cluster with 44 identical nodes, but I have a problem with slurmctld.service.


When I try to start slurmctld I get the following message:


fatal: You are running with a database but for some reason we have no TRES from it.  This should only happen if the database is down and you don't have any state files


*	Ubuntu 20.04.2 runs on the server and on the nodes, in exactly the same version.
*	munge 0.5.13, installed from the Ubuntu repo, runs on the server and on the nodes.
*	mysql Ver 8.0.23-0ubuntu0.20.04.1 for Linux on x86_64 ((Ubuntu)), installed from the Ubuntu repo, runs on the server.


slurm.conf is the same on all nodes and on server.


slurmd.service is active and running on all nodes without problem.


mysql.service is active and running on server.

slurmdbd.service is active and running on server (slurm_acct_db created).


Please find attached slurm.conf, slurmdbd.conf, and the detailed output of the slurmctld -Dvvvv command.


Any hint?


Thanks in advance





