[slurm-users] slurm_persist_conn_open_without_init: failed to open persistent connection to host

Sushil Mishra sushilbioinfo at gmail.com
Wed Nov 30 22:44:23 UTC 2022


Hi all,

I installed Slurm and enabled accounting on a single-node machine, i.e. the
same server is both the master and the compute node. I mainly followed this
page for instructions:
https://southgreenplatform.github.io/trainings/hpc/slurminstallation/
After enabling accounting, I am having problems starting
slurmctld.service.
[root@mannose sushil]# cat /var/log/slurm/slurmctld.log
[2022-11-30T16:32:15.194] Job accounting information stored, but details not gathered
[2022-11-30T16:32:15.195] slurmctld version 20.11.9 started on cluster mannose.olemiss.edu
[2022-11-30T16:32:15.201] error: slurm_persist_conn_open_without_init: failed to open persistent connection to host:mannose:6819: Connection refused
[2022-11-30T16:32:15.201] error: Sending PersistInit msg: Connection refused
[2022-11-30T16:32:15.201] accounting_storage/slurmdbd: clusteracct_storage_p_register_ctld: Registering slurmctld at port 6817 with slurmdbd
[2022-11-30T16:32:15.203] error: Sending PersistInit msg: Connection refused
[2022-11-30T16:32:15.203] error: Association database appears down, reading from state file.
[2022-11-30T16:32:15.203] error: Unable to get any information from the state file
[2022-11-30T16:32:15.203] fatal: slurmdbd and/or database must be up at slurmctld start time
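
The "Connection refused" error above means that nothing accepted the TCP connection on mannose:6819 at that moment. One way to check this independently of Slurm is a plain TCP connect. The following is a minimal sketch, assuming Python 3 is available on the server; the function name port_accepts_connections is hypothetical and not part of Slurm.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling port_accepts_connections("mannose", 6819) on the server itself would return False while the log shows "Connection refused". A False result from the local machine suggests that no process is listening on that port, which a firewall rule alone would not change.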

It is not clear why port 6819 is being used when I have
SlurmctldPort=6817 and SlurmdPort=6818 set in slurm.conf. Anyway, I opened
all three ports (6817, 6818 and 6819) using commands such as
'firewall-cmd --permanent --zone=public --add-port=6819/tcp'
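
For context, 6819 is the default port on which slurmdbd listens (DbdPort in slurmdbd.conf, AccountingStoragePort in slurm.conf), so it is separate from SlurmctldPort and SlurmdPort. The sketch below shows one way to list the port settings that a slurm.conf would yield once those defaults are filled in. It is an illustration only: the helper names parse_slurm_conf and effective_ports are hypothetical, the parser handles only simple Key=Value lines, and the defaults table assumes a stock build.

```python
# Default ports, assuming Slurm was built with its stock settings.
DEFAULT_PORTS = {
    "slurmctldport": "6817",
    "slurmdport": "6818",
    "accountingstorageport": "6819",
}


def parse_slurm_conf(text: str) -> dict:
    """Parse simple Key=Value lines; keys are lowercased, '#' starts a comment."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


def effective_ports(text: str) -> dict:
    """Return the port settings, falling back to the assumed defaults."""
    settings = parse_slurm_conf(text)
    return {name: settings.get(name, default) for name, default in DEFAULT_PORTS.items()}
```

With only SlurmctldPort=6817 and SlurmdPort=6818 set, effective_ports still reports 6819 for accountingstorageport, which matches the port in the error message.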

MariaDB [(none)]> show grants
    -> ;
+--------------------------------------------------------------------------------------------------------------+
| Grants for slurm@localhost                                                                                   |
+--------------------------------------------------------------------------------------------------------------+
| GRANT USAGE ON *.* TO 'slurm'@'localhost' IDENTIFIED BY PASSWORD '*0E54A04D59B6C9F7B7B6269BE7F30AD3E3409895' |
| GRANT ALL PRIVILEGES ON `slurm_acct_db`.* TO 'slurm'@'localhost' WITH GRANT OPTION                           |
+--------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

MariaDB [(none)]> quit

Can someone help me figure out what is going wrong?

Best,
SK