[slurm-users] [EXT] slurmctld error
Ioannis Botsis
ibotsis at isc.tuc.gr
Mon Apr 5 11:00:29 UTC 2021
Hi Sean,
Thank you for your prompt response. I made the changes you suggested, but slurmctld still refuses to run. Please find attached the new slurmctld -Dvvvv output.
jb
From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Sean Crosby
Sent: Monday, April 5, 2021 11:46 AM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] [EXT] slurmctld error
Hi Jb,
You have set AccountingStoragePort to 3306 in slurm.conf, which is the MySQL port running on the DBD host.
AccountingStoragePort is the port for the Slurmdbd service, and not for MySQL.
Change AccountingStoragePort to 6819 and it should fix your issues.
I also think you should comment out these lines:
AccountingStorageUser=slurm
AccountingStoragePass=/run/munge/munge.socket.2
You shouldn't need them.
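To illustrate the port distinction, here is a minimal sketch of the relevant settings. It assumes slurmdbd listens on its default port 6819, MySQL listens on 3306, and accounting goes through slurmdbd. The AccountingStorageType and StorageType lines are assumptions about the attached files, which are not shown here, and everything else in both files stays as it is.

```
# slurm.conf (read by slurmctld): points at the slurmdbd daemon, not at MySQL
AccountingStorageType=accounting_storage/slurmdbd   # assumed, not confirmed from the attachment
AccountingStoragePort=6819
#AccountingStorageUser=slurm
#AccountingStoragePass=/run/munge/munge.socket.2

# slurmdbd.conf (read by slurmdbd): the MySQL port belongs here
DbdPort=6819                                        # must match AccountingStoragePort above
StorageType=accounting_storage/mysql                # assumed, not confirmed from the attachment
StoragePort=3306
```

In short, slurmctld talks to slurmdbd on DbdPort, and only slurmdbd talks to MySQL on StoragePort.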
Sean
--
Sean Crosby | Senior DevOpsHPC Engineer and HPC Team Lead
Research Computing Services | Business Services
The University of Melbourne, Victoria 3010 Australia
On Mon, 5 Apr 2021 at 18:03, Ioannis Botsis <ibotsis at isc.tuc.gr> wrote:
Hello everyone,
I installed Slurm 19.05.5 from the Ubuntu repo, for the first time, on a cluster with 44 identical nodes, but I have a problem with slurmctld.service.
When I try to start slurmctld, I get the following message:
fatal: You are running with a database but for some reason we have no TRES from it. This should only happen if the database is down and you don't have any state files
* Ubuntu 20.04.2 runs on the server and on all nodes, in exactly the same version.
* munge 0.5.13 installed from Ubuntu repo running on server and nodes.
* mysql Ver 8.0.23-0ubuntu0.20.04.1 for Linux on x86_64 ((Ubuntu)) installed from ubuntu repo running on server.
slurm.conf is the same on all nodes and on server.
slurmd.service is active and running on all nodes without problem.
mysql.service is active and running on server.
slurmdbd.service is active and running on server (slurm_acct_db created).
Please find attached slurm.conf, slurmdbd.conf, and the detailed output of the slurmctld -Dvvvv command.
Any hint?
Thanks in advance
jb
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: slurmdctl -Dvvvv new.txt
URL: <http://lists.schedmd.com/pipermail/slurm-users/attachments/20210405/42f36e25/attachment.txt>