[slurm-users] [EXT] Re: slurmdbd does not work

Giuseppe G. A. Celano giuseppegacelano at gmail.com
Sat Dec 4 01:30:40 UTC 2021


I have installed almost all of the possible packages, but that file doesn't
show up:

libdbd-mariadb-perl/focal,now 1.11-3ubuntu2 amd64 [installed]
libmariadb-dev-compat/unknown,now 1:10.4.22+maria~focal amd64 [installed]
libmariadb-dev/unknown,now 1:10.4.22+maria~focal amd64 [installed]
libmariadb3-compat/unknown,now 1:10.4.22+maria~focal amd64 [installed]
libmariadb3/unknown,now 1:10.4.22+maria~focal amd64 [installed,automatic]
libmariadbclient18/unknown,now 1:10.4.22+maria~focal amd64 [installed]
libmariadbd-dev/unknown,now 1:10.4.22+maria~focal amd64 [installed]
libmariadbd19/unknown,now 1:10.4.22+maria~focal amd64 [installed]
mariadb-client-10.4/unknown,now 1:10.4.22+maria~focal amd64 [installed,automatic]
mariadb-client-core-10.4/unknown,now 1:10.4.22+maria~focal amd64 [installed]
mariadb-client/unknown,unknown,unknown,now 1:10.4.22+maria~focal all [installed]
mariadb-common/unknown,unknown,unknown,now 1:10.4.22+maria~focal all [installed]
mariadb-plugin-connect/unknown,now 1:10.4.22+maria~focal amd64 [installed]
mariadb-server-10.4/unknown,now 1:10.4.22+maria~focal amd64 [installed]
mariadb-server-core-10.4/unknown,now 1:10.4.22+maria~focal amd64 [installed]
mariadb-server/unknown,unknown,unknown,now 1:10.4.22+maria~focal all [installed]
odbc-mariadb/focal,now 3.1.4-1 amd64 [installed]
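
For anyone following along, here is a minimal sketch of how one might check the two things discussed in this thread: whether accounting_storage_mysql.so exists in the plugin directory, and whether a mysql_config helper is on the PATH for configure to find. The function names are hypothetical (they are not part of Slurm), the default directory is the PluginDir from the slurmdbd.conf quoted below, and the assumption that configure relies on mysql_config is inferred from the --with-mysql_config option mentioned below.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Slurm.

# Report whether the MySQL accounting plugin exists in a plugin directory.
# Default directory is an assumption: PluginDir from the quoted slurmdbd.conf.
check_mysql_plugin() {
    plugin_dir="${1:-/usr/lib/slurm}"
    plugin="$plugin_dir/accounting_storage_mysql.so"
    if [ -f "$plugin" ]; then
        echo "found: $plugin"
        return 0
    fi
    echo "missing: $plugin"
    return 1
}

# Report where mysql_config lives, if it is on the PATH at all.
find_mysql_config() {
    if command -v mysql_config >/dev/null 2>&1; then
        command -v mysql_config
        return 0
    fi
    echo "mysql_config not found on PATH"
    return 1
}
```

After sourcing the file, running check_mysql_plugin /usr/lib/slurm and find_mysql_config gives a quick yes/no for each. If the plugin is still missing after a rebuild, searching config.log in the Slurm build directory for "mysql" should show what configure decided.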


On Sat, Dec 4, 2021 at 2:06 AM Sean Crosby <scrosby at unimelb.edu.au> wrote:

> Try installing the libmariadb-dev-compat package and trying the
> configure/make again. It provides "libmysqlclient.so", whereas
> libmariadb-dev provides "libmariadb.so"
> ------------------------------
> *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of
> Giuseppe G. A. Celano <giuseppegacelano at gmail.com>
> *Sent:* Saturday, 4 December 2021 11:40
> *To:* Slurm User Community List <slurm-users at lists.schedmd.com>
> *Subject:* Re: [slurm-users] [EXT] Re: slurmdbd does not work
>
> ------------------------------
> 10.4.22
>
>
> On Sat, Dec 4, 2021 at 1:35 AM Brian Andrus <toomuchit at gmail.com> wrote:
>
> Which version of Mariadb are you using?
>
> Brian Andrus
> On 12/3/2021 4:20 PM, Giuseppe G. A. Celano wrote:
>
> After installation of libmariadb-dev, I have reinstalled the entire slurm
> with ./configure + options, make, and make install. Still,
> accounting_storage_mysql.so is missing.
>
>
>
> On Sat, Dec 4, 2021 at 12:24 AM Sean Crosby <scrosby at unimelb.edu.au>
> wrote:
>
> Did you run
>
> ./configure (with any other options you normally use)
> make
> make install
>
> on your DBD server after you installed the mariadb-devel package?
>
> ------------------------------
> *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of
> Giuseppe G. A. Celano <giuseppegacelano at gmail.com>
> *Sent:* Saturday, 4 December 2021 10:07
> *To:* Slurm User Community List <slurm-users at lists.schedmd.com>
> *Subject:* [EXT] Re: [slurm-users] slurmdbd does not work
>
> ------------------------------
> The problem is the lack of /usr/lib/slurm/accounting_storage_mysql.so
>
> I have installed many mariadb-related packages, but that file is not
> created by slurm after installation: is there a point in the documentation
> where the installation procedure for the database is made explicit?
>
>
>
> On Fri, Dec 3, 2021 at 5:15 PM Brian Andrus <toomuchit at gmail.com> wrote:
>
> You will need to also reinstall/restart slurmdbd with the updated binary.
>
> Look in the slurmdbd logs to see what is happening there. I suspect it had
> errors updating/creating the database and tables. If you have no data in it
> yet, you can just DROP the database and restart slurmdbd.
>
> Brian Andrus
> On 12/3/2021 6:42 AM, Giuseppe G. A. Celano wrote:
>
> Thanks for the answer, Brian. I have now added
> --with-mysql_config=/etc/mysql/my.cnf, but the problem is still there, and
> now slurmctld does not work either, failing with this error:
>
> [2021-12-03T15:36:41.018] accounting_storage/slurmdbd:
> clusteracct_storage_p_register_ctld: Registering slurmctld at port 6817
> with slurmdbd
> [2021-12-03T15:36:41.019] error: _conn_readable: persistent connection for
> fd 9 experienced error[104]: Connection reset by peer
> [2021-12-03T15:36:41.019] error: _slurm_persist_recv_msg: only read 150 of
> 2613 bytes
> [2021-12-03T15:36:41.019] error: Sending PersistInit msg: No error
> [2021-12-03T15:36:41.020] error: _conn_readable: persistent connection for
> fd 9 experienced error[104]: Connection reset by peer
> [2021-12-03T15:36:41.020] error: _slurm_persist_recv_msg: only read 150 of
> 2613 bytes
> [2021-12-03T15:36:41.020] error: Sending PersistInit msg: No error
> [2021-12-03T15:36:41.020] error: _conn_readable: persistent connection for
> fd 9 experienced error[104]: Connection reset by peer
> [2021-12-03T15:36:41.020] error: _slurm_persist_recv_msg: only read 150 of
> 2613 bytes
> [2021-12-03T15:36:41.020] error: Sending PersistInit msg: No error
> [2021-12-03T15:36:41.020] error: DBD_GET_TRES failure: No error
> [2021-12-03T15:36:41.021] error: _conn_readable: persistent connection for
> fd 9 experienced error[104]: Connection reset by peer
> [2021-12-03T15:36:41.021] error: _slurm_persist_recv_msg: only read 0 of
> 2613 bytes
> [2021-12-03T15:36:41.021] error: Sending PersistInit msg: No error
> [2021-12-03T15:36:41.021] error: DBD_GET_QOS failure: No error
> [2021-12-03T15:36:41.021] error: _conn_readable: persistent connection for
> fd 9 experienced error[104]: Connection reset by peer
> [2021-12-03T15:36:41.021] error: _slurm_persist_recv_msg: only read 150 of
> 2613 bytes
> [2021-12-03T15:36:41.021] error: Sending PersistInit msg: No error
> [2021-12-03T15:36:41.021] error: DBD_GET_USERS failure: No error
> [2021-12-03T15:36:41.022] error: _conn_readable: persistent connection for
> fd 9 experienced error[104]: Connection reset by peer
> [2021-12-03T15:36:41.022] error: _slurm_persist_recv_msg: only read 0 of
> 2613 bytes
> [2021-12-03T15:36:41.022] error: Sending PersistInit msg: No error
> [2021-12-03T15:36:41.022] error: DBD_GET_ASSOCS failure: No error
> [2021-12-03T15:36:41.022] error: _conn_readable: persistent connection for
> fd 9 experienced error[104]: Connection reset by peer
> [2021-12-03T15:36:41.022] error: _slurm_persist_recv_msg: only read 0 of
> 2613 bytes
> [2021-12-03T15:36:41.022] error: Sending PersistInit msg: No error
> [2021-12-03T15:36:41.022] error: DBD_GET_RES failure: No error
> [2021-12-03T15:36:41.022] fatal: You are running with a database but for
> some reason we have no TRES from it.  This should only happen if the
> database is down and you don't have any state files.
>
>
>
> On Thu, Dec 2, 2021 at 10:36 PM Brian Andrus <toomuchit at gmail.com> wrote:
>
>
> Your Slurm needs to be built with MySQL support. If you have mysql-devel
> installed, configure should pick it up; otherwise you can specify the
> location with --with-mysql when you configure/build Slurm.
>
> Brian Andrus
> On 12/2/2021 12:40 PM, Giuseppe G. A. Celano wrote:
>
> Hi everyone,
>
> I am having trouble getting *slurmdbd* to work. This is the error I get:
>
> error: Couldn't find the specified plugin name for accounting_storage/mysql looking at all files
> error: cannot find accounting_storage plugin for accounting_storage/mysql
> error: cannot create accounting_storage context for accounting_storage/mysql
> fatal: Unable to initialize accounting_storage/mysql accounting storage plugin
>
> I have installed *mysql* (*apt install mysql*) on Ubuntu 20.04.03 and
> followed the instructions on the slurm website
> <https://slurm.schedmd.com/accounting.html>; *mysql* is running (*port
> 3306*) and these are the relevant parts in my *.conf* files:
>
> *slurm.conf*
>
> # LOGGING AND ACCOUNTING
> AccountingStorageHost=localhost
> AccountingStoragePort=3306
> AccountingStorageType=accounting_storage/slurmdbd
> AccountingStorageUser=slurm
> JobCompType=jobcomp/none
> JobAcctGatherFrequency=30
> JobAcctGatherType=jobacct_gather/linux
> SlurmctldDebug=info
> SlurmctldLogFile=/var/log/slurmctld.log
> SlurmdDebug=info
> SlurmdLogFile=/var/log/slurmd.log
>
> *slurmdbd.conf*
>
> AuthType=auth/munge
> DbdAddr=localhost
> DbdHost=localhost
> DbdPort=3306
> LogFile=/var/log/slurmdbd.log
> PidFile=/var/run/slurmdbd.pid
> PluginDir=/usr/lib/slurm
> SlurmUser=slurm
> StoragePass=password
> StorageType=accounting_storage/mysql
> StorageUser=slurm
> StorageLoc=slurm_acct_db
>
> I changed the port to 3306 because otherwise *slurmdbd* could not
> communicate with *mysql*. If I run *sacct*, for example, I get:
>
> sacct: error: _slurm_persist_recv_msg: read of fd 3 failed: No error
> sacct: error: _slurm_persist_recv_msg: only read 126 of 2616 bytes
> sacct: error: slurm_persist_conn_open: No response to persist_init
> sacct: error: Sending PersistInit msg: No error
> JobID           JobName  Partition    Account  AllocCPUS      State ExitCode
> ------------ ---------- ---------- ---------- ---------- ---------- --------
> sacct: error: _slurm_persist_recv_msg: read of fd 3 failed: No error
> sacct: error: _slurm_persist_recv_msg: only read 126 of 2616 bytes
> sacct: error: Sending PersistInit msg: No error
> sacct: error: DBD_GET_JOBS_COND failure: Unspecified error
>
> Does anyone have a suggestion to solve this problem? Thank you very much.
>
> Best,
> Giuseppe
>
>