[slurm-users] slurmdbd does not work

Giuseppe G. A. Celano giuseppegacelano at gmail.com
Fri Dec 3 14:42:45 UTC 2021


Thanks for the answer, Brian. I have now added
--with-mysql_config=/etc/mysql/my.cnf, but the problem persists, and now
slurmctld does not work either. It fails with this error:

[2021-12-03T15:36:41.018] accounting_storage/slurmdbd: clusteracct_storage_p_register_ctld: Registering slurmctld at port 6817 with slurmdbd
[2021-12-03T15:36:41.019] error: _conn_readable: persistent connection for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.019] error: _slurm_persist_recv_msg: only read 150 of 2613 bytes
[2021-12-03T15:36:41.019] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.020] error: _conn_readable: persistent connection for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.020] error: _slurm_persist_recv_msg: only read 150 of 2613 bytes
[2021-12-03T15:36:41.020] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.020] error: _conn_readable: persistent connection for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.020] error: _slurm_persist_recv_msg: only read 150 of 2613 bytes
[2021-12-03T15:36:41.020] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.020] error: DBD_GET_TRES failure: No error
[2021-12-03T15:36:41.021] error: _conn_readable: persistent connection for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.021] error: _slurm_persist_recv_msg: only read 0 of 2613 bytes
[2021-12-03T15:36:41.021] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.021] error: DBD_GET_QOS failure: No error
[2021-12-03T15:36:41.021] error: _conn_readable: persistent connection for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.021] error: _slurm_persist_recv_msg: only read 150 of 2613 bytes
[2021-12-03T15:36:41.021] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.021] error: DBD_GET_USERS failure: No error
[2021-12-03T15:36:41.022] error: _conn_readable: persistent connection for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.022] error: _slurm_persist_recv_msg: only read 0 of 2613 bytes
[2021-12-03T15:36:41.022] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.022] error: DBD_GET_ASSOCS failure: No error
[2021-12-03T15:36:41.022] error: _conn_readable: persistent connection for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.022] error: _slurm_persist_recv_msg: only read 0 of 2613 bytes
[2021-12-03T15:36:41.022] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.022] error: DBD_GET_RES failure: No error
[2021-12-03T15:36:41.022] fatal: You are running with a database but for some reason we have no TRES from it.  This should only happen if the database is down and you don't have any state files.
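
A note on the configure flag mentioned above: as far as I know, Slurm's
--with-mysql_config=PATH option expects the directory that contains the
mysql_config helper binary (on Ubuntu this comes from a client development
package such as libmysqlclient-dev), not the server's my.cnf. Below is a
minimal sketch, assuming Python 3 is available, that looks for the helper on
PATH and prints a candidate flag. The function names find_helper and
configure_flag are illustrative only and are not part of Slurm.

```python
import os
import shutil
from typing import Optional


def configure_flag(helper_path: str) -> str:
    """Build a candidate configure flag from the full path of the helper binary."""
    # Assumption: the flag takes the directory holding mysql_config,
    # not the binary itself and not a my.cnf file.
    return "--with-mysql_config=" + os.path.dirname(helper_path)


def find_helper() -> Optional[str]:
    """Return the first of mysql_config / mariadb_config found on PATH, or None."""
    for name in ("mysql_config", "mariadb_config"):
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    helper = find_helper()
    if helper is None:
        print("no mysql_config found; the client development package may be missing")
    else:
        print(configure_flag(helper))
```

If the helper is missing, installing the development package and re-running
configure is probably the first thing to try before passing the flag at all.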



On Thu, Dec 2, 2021 at 10:36 PM Brian Andrus <toomuchit at gmail.com> wrote:

>
> Your Slurm needs to be built with MySQL support. If you have mysql-devel
> installed, it should pick it up; otherwise you can specify the location
> with --with-mysql when you configure/build Slurm.
>
> Brian Andrus
> On 12/2/2021 12:40 PM, Giuseppe G. A. Celano wrote:
>
> Hi everyone,
>
> I am having trouble getting *slurmdbd* to work. This is the error I get:
>
> error: Couldn't find the specified plugin name for accounting_storage/mysql looking at all files
> error: cannot find accounting_storage plugin for accounting_storage/mysql
> error: cannot create accounting_storage context for accounting_storage/mysql
> fatal: Unable to initialize accounting_storage/mysql accounting storage plugin
>
> I have installed *mysql* (*apt install mysql*) on Ubuntu 20.04.3 and
> followed the instructions on the Slurm website
> <https://slurm.schedmd.com/accounting.html>. *mysql* is running (*port
> 3306*), and these are the relevant parts of my *.conf* files:
>
> *slurm.conf*
>
> # LOGGING AND ACCOUNTING
> AccountingStorageHost=localhost
> AccountingStoragePort=3306
> AccountingStorageType=accounting_storage/slurmdbd
> AccountingStorageUser=slurm
> JobCompType=jobcomp/none
> JobAcctGatherFrequency=30
> JobAcctGatherType=jobacct_gather/linux
> SlurmctldDebug=info
> SlurmctldLogFile=/var/log/slurmctld.log
> SlurmdDebug=info
> SlurmdLogFile=/var/log/slurmd.log
>
> *slurmdbd.conf*
>
> AuthType=auth/munge
> DbdAddr=localhost
> DbdHost=localhost
> DbdPort=3306
> LogFile=/var/log/slurmdbd.log
> PidFile=/var/run/slurmdbd.pid
> PluginDir=/usr/lib/slurm
> SlurmUser=slurm
> StoragePass=password
> StorageType=accounting_storage/mysql
> StorageUser=slurm
> StorageLoc=slurm_acct_db
>
> I changed the port to 3306 because otherwise *slurmdbd* could not
> communicate with *mysql*. If I run *sacct*, for example, I get:
>
> sacct: error: _slurm_persist_recv_msg: read of fd 3 failed: No error
> sacct: error: _slurm_persist_recv_msg: only read 126 of 2616 bytes
> sacct: error: slurm_persist_conn_open: No response to persist_init
> sacct: error: Sending PersistInit msg: No error
> JobID           JobName  Partition    Account  AllocCPUS      State ExitCode
> ------------ ---------- ---------- ---------- ---------- ---------- --------
> sacct: error: _slurm_persist_recv_msg: read of fd 3 failed: No error
> sacct: error: _slurm_persist_recv_msg: only read 126 of 2616 bytes
> sacct: error: Sending PersistInit msg: No error
> sacct: error: DBD_GET_JOBS_COND failure: Unspecified error
>
> Does anyone have a suggestion to solve this problem? Thank you very much.
>
> Best,
> Giuseppe
>
>
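
One thing that may be worth checking in the quoted configuration: according
to the slurm.conf and slurmdbd.conf man pages, AccountingStoragePort and
DbdPort both refer to the port that slurmdbd itself listens on (default
6819), while the port slurmdbd uses to reach MySQL is StoragePort (default
3306). With DbdPort=3306, clients such as slurmctld and sacct may end up
talking to MySQL directly, which could explain the short reads. This is a
guess from the configs shown, not a confirmed diagnosis. A minimal sketch of
such a check follows; parse_conf and port_warnings are hypothetical helpers,
not Slurm tools.

```python
MYSQL_DEFAULT_PORT = 3306     # default MySQL port (StoragePort default)
SLURMDBD_DEFAULT_PORT = 6819  # default slurmdbd listening port


def parse_conf(text: str) -> dict:
    """Parse simple Key=Value lines, ignoring blank lines and # comments."""
    conf = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            conf[key.strip().lower()] = value.strip()  # keys are case-insensitive
    return conf


def port_warnings(slurm_conf: str, slurmdbd_conf: str) -> list:
    """Return human-readable warnings about suspicious port settings."""
    ctld = parse_conf(slurm_conf)
    dbd = parse_conf(slurmdbd_conf)
    dbd_port = int(dbd.get("dbdport", SLURMDBD_DEFAULT_PORT))
    storage_port = int(dbd.get("storageport", MYSQL_DEFAULT_PORT))
    acct_port = int(ctld.get("accountingstorageport", SLURMDBD_DEFAULT_PORT))

    warnings = []
    if dbd_port == storage_port:
        warnings.append(
            f"DbdPort ({dbd_port}) equals the database port; "
            "slurmdbd and MySQL cannot both listen there"
        )
    if acct_port != dbd_port:
        warnings.append(
            f"AccountingStoragePort ({acct_port}) differs from DbdPort ({dbd_port})"
        )
    return warnings


if __name__ == "__main__":
    # Values taken from the configuration quoted in this thread.
    slurm_conf = "AccountingStorageHost=localhost\nAccountingStoragePort=3306\n"
    slurmdbd_conf = "DbdHost=localhost\nDbdPort=3306\nStorageType=accounting_storage/mysql\n"
    for warning in port_warnings(slurm_conf, slurmdbd_conf):
        print(warning)
```

If the check flags the quoted settings, one option to try is leaving DbdPort
and AccountingStoragePort at their defaults and setting StoragePort=3306 in
slurmdbd.conf instead.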