[slurm-users] [EXT] slurmctld error

Ioannis Botsis ibotsis at isc.tuc.gr
Mon Apr 5 11:00:29 UTC 2021


Hi Sean,

 

Thank you for your prompt response. I made the changes you suggested, but slurmctld still refuses to run. Please find attached the new slurmctld -Dvvvv output.

 

jb

 

 

 

From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Sean Crosby
Sent: Monday, April 5, 2021 11:46 AM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] [EXT] slurmctld error

 

Hi Jb,

 

You have set AccountingStoragePort to 3306 in slurm.conf, which is the MySQL port running on the DBD host.

 

AccountingStoragePort is the port for the Slurmdbd service, and not for MySQL.

 

Change AccountingStoragePort to 6819 and it should fix your issues.
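
As a quick sanity check, this particular mix-up can be caught with a few lines of Python. This is only a minimal sketch: the function names are hypothetical, it assumes the usual defaults (6819 for slurmdbd, 3306 for MySQL), and it only reads simple Key=Value lines from the text of slurm.conf.

```python
# Assumed defaults: slurmdbd listens on 6819, MySQL on 3306.
SLURMDBD_DEFAULT_PORT = 6819
MYSQL_DEFAULT_PORT = 3306


def parse_slurm_conf(text):
    """Return a dict of Key=Value settings, skipping comments and blank lines.

    Keys are lower-cased because Slurm keywords are case-insensitive.
    Only the first Key=Value pair on each line is recorded.
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


def check_accounting_port(text):
    """Return a warning string if AccountingStoragePort is the MySQL port, else None."""
    port = parse_slurm_conf(text).get("accountingstorageport")
    if port is None or not port.isdigit():
        return None  # unset or not a plain number: nothing to flag here
    if int(port) == MYSQL_DEFAULT_PORT:
        return (
            "AccountingStoragePort=%d is the MySQL port; "
            "slurmdbd normally listens on %d"
            % (MYSQL_DEFAULT_PORT, SLURMDBD_DEFAULT_PORT)
        )
    return None
```

Calling check_accounting_port() on the contents of slurm.conf returns a warning for the setting described above and None once the port is changed to 6819 or the line is commented out.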

 

I also think you should comment out the following lines:

 

AccountingStorageUser=slurm
AccountingStoragePass=/run/munge/munge.socket.2

 

You shouldn't need those lines.
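
For reference, the accounting part of slurm.conf after these changes might look roughly like the sketch below. The AccountingStorageType line and the host name are assumptions (the attached slurm.conf is not reproduced here), so adjust them to match your setup.

```
# slurm.conf, accounting section only (sketch)
AccountingStorageType=accounting_storage/slurmdbd   # assumed; not shown in the thread
AccountingStorageHost=dbdhost                       # hypothetical host name
AccountingStoragePort=6819                          # slurmdbd port, not the MySQL port
#AccountingStorageUser=slurm
#AccountingStoragePass=/run/munge/munge.socket.2
```

The MySQL port (3306) belongs in slurmdbd.conf instead, as StoragePort, since slurmdbd is the service that talks to the database.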

 

Sean

 

--
Sean Crosby | Senior DevOpsHPC Engineer and HPC Team Lead
Research Computing Services | Business Services
The University of Melbourne, Victoria 3010 Australia

 

 

On Mon, 5 Apr 2021 at 18:03, Ioannis Botsis <ibotsis at isc.tuc.gr> wrote:


Hello everyone,

 

I installed Slurm 19.05.5 from the Ubuntu repo for the first time, on a cluster with 44 identical nodes, but I have a problem with slurmctld.service.

 

When I try to start slurmctld I get the following message:

 

fatal: You are running with a database but for some reason we have no TRES from it.  This should only happen if the database is down and you don't have any state files

 

*	Ubuntu 20.04.2 runs on the server and on all nodes, in exactly the same version.
*	munge 0.5.13, installed from the Ubuntu repo, is running on the server and the nodes.
*	mysql Ver 8.0.23-0ubuntu0.20.04.1 for Linux on x86_64 ((Ubuntu)), installed from the Ubuntu repo, is running on the server.

 

slurm.conf is the same on all nodes and on server.

 

slurmd.service is active and running on all nodes without problem.

 

mysql.service is active and running on server.

slurmdbd.service is active and running on server (slurm_acct_db created).

 

Please find attached slurm.conf, slurmdbd.conf, and the detailed output of the slurmctld -Dvvvv command.

 

Any hint?

 

Thanks in advance

 

jb

 

 

 

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: slurmdctl -Dvvvv new.txt
URL: <http://lists.schedmd.com/pipermail/slurm-users/attachments/20210405/42f36e25/attachment.txt>

