[slurm-users] [EXT] Re: slurmdbd does not work

Brian Andrus toomuchit at gmail.com
Sat Dec 4 00:33:11 UTC 2021


Which version of MariaDB are you using?

Brian Andrus
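
For anyone following the thread: accounting_storage_mysql.so is only built when Slurm's configure step finds the MySQL/MariaDB client development files. Below is a minimal pre-flight sketch in Python. It assumes the plugin directory /usr/lib/slurm from the quoted slurmdbd.conf, and it assumes that configure locates the client library through a mysql_config or mariadb_config helper on PATH. The function name check_mysql_build_prereqs is hypothetical and not part of Slurm.

```python
import shutil
from pathlib import Path

# File name taken from the thread; the default directory below is the
# PluginDir quoted in the original slurmdbd.conf.
PLUGIN_NAME = "accounting_storage_mysql.so"


def check_mysql_build_prereqs(plugin_dir="/usr/lib/slurm"):
    """Report whether the MySQL accounting plugin exists and which
    client config helpers are visible on PATH.

    Hypothetical helper for illustration only.
    """
    plugin = Path(plugin_dir) / PLUGIN_NAME
    # Assumption: the build looks for one of these helper programs,
    # which are shipped by the client development packages.
    tools = {
        name: shutil.which(name)
        for name in ("mysql_config", "mariadb_config")
    }
    return {"plugin_present": plugin.is_file(), "config_tools": tools}


if __name__ == "__main__":
    report = check_mysql_build_prereqs()
    print("plugin present:", report["plugin_present"])
    for name, path in report["config_tools"].items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If neither helper is found, the development package is probably missing or installed somewhere configure does not look. After fixing that, rerun ./configure, make, and make install, then search config.log for "mysql" to see what configure detected.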

On 12/3/2021 4:20 PM, Giuseppe G. A. Celano wrote:
> After installation of libmariadb-dev, I have reinstalled the entire 
> slurm with ./configure + options, make, and make install. Still, 
> accounting_storage_mysql.so is missing.
>
>
>
> On Sat, Dec 4, 2021 at 12:24 AM Sean Crosby <scrosby at unimelb.edu.au> 
> wrote:
>
>     Did you run
>
>     ./configure (with any other options you normally use)
>     make
>     make install
>
>     on your DBD server after you installed the mariadb-devel package?
>
>     ------------------------------------------------------------------------
>     *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on
>     behalf of Giuseppe G. A. Celano <giuseppegacelano at gmail.com>
>     *Sent:* Saturday, 4 December 2021 10:07
>     *To:* Slurm User Community List <slurm-users at lists.schedmd.com>
>     *Subject:* [EXT] Re: [slurm-users] slurmdbd does not work
>     ------------------------------------------------------------------------
>     The problem is the lack of /usr/lib/slurm/accounting_storage_mysql.so
>
>     I have installed many mariadb-related packages, but that file is
>     not created by slurm after installation: is there a point in the
>     documentation where the installation procedure for the database is
>     made explicit?
>
>
>
>     On Fri, Dec 3, 2021 at 5:15 PM Brian Andrus <toomuchit at gmail.com>
>     wrote:
>
>         You will need to also reinstall/restart slurmdbd with the
>         updated binary.
>
>         Look in the slurmdbd logs to see what is happening there. I
>         suspect it had errors updating/creating the database and
>         tables. If you have no data in it yet, you can just DROP the
>         database and restart slurmdbd.
>
>         Brian Andrus
>
>         On 12/3/2021 6:42 AM, Giuseppe G. A. Celano wrote:
>>         Thanks for the answer, Brian. I now added
>>         --with-mysql_config=/etc/mysql/my.cnf, but the problem is
>>         still there and now also slurmctld does not work, with the error:
>>
>>         [2021-12-03T15:36:41.018] accounting_storage/slurmdbd:
>>         clusteracct_storage_p_register_ctld: Registering slurmctld at
>>         port 6817 with slurmdbd
>>         [2021-12-03T15:36:41.019] error: _conn_readable: persistent
>>         connection for fd 9 experienced error[104]: Connection reset
>>         by peer
>>         [2021-12-03T15:36:41.019] error: _slurm_persist_recv_msg:
>>         only read 150 of 2613 bytes
>>         [2021-12-03T15:36:41.019] error: Sending PersistInit msg: No
>>         error
>>         [2021-12-03T15:36:41.020] error: _conn_readable: persistent
>>         connection for fd 9 experienced error[104]: Connection reset
>>         by peer
>>         [2021-12-03T15:36:41.020] error: _slurm_persist_recv_msg:
>>         only read 150 of 2613 bytes
>>         [2021-12-03T15:36:41.020] error: Sending PersistInit msg: No
>>         error
>>         [2021-12-03T15:36:41.020] error: _conn_readable: persistent
>>         connection for fd 9 experienced error[104]: Connection reset
>>         by peer
>>         [2021-12-03T15:36:41.020] error: _slurm_persist_recv_msg:
>>         only read 150 of 2613 bytes
>>         [2021-12-03T15:36:41.020] error: Sending PersistInit msg: No
>>         error
>>         [2021-12-03T15:36:41.020] error: DBD_GET_TRES failure: No error
>>         [2021-12-03T15:36:41.021] error: _conn_readable: persistent
>>         connection for fd 9 experienced error[104]: Connection reset
>>         by peer
>>         [2021-12-03T15:36:41.021] error: _slurm_persist_recv_msg:
>>         only read 0 of 2613 bytes
>>         [2021-12-03T15:36:41.021] error: Sending PersistInit msg: No
>>         error
>>         [2021-12-03T15:36:41.021] error: DBD_GET_QOS failure: No error
>>         [2021-12-03T15:36:41.021] error: _conn_readable: persistent
>>         connection for fd 9 experienced error[104]: Connection reset
>>         by peer
>>         [2021-12-03T15:36:41.021] error: _slurm_persist_recv_msg:
>>         only read 150 of 2613 bytes
>>         [2021-12-03T15:36:41.021] error: Sending PersistInit msg: No
>>         error
>>         [2021-12-03T15:36:41.021] error: DBD_GET_USERS failure: No error
>>         [2021-12-03T15:36:41.022] error: _conn_readable: persistent
>>         connection for fd 9 experienced error[104]: Connection reset
>>         by peer
>>         [2021-12-03T15:36:41.022] error: _slurm_persist_recv_msg:
>>         only read 0 of 2613 bytes
>>         [2021-12-03T15:36:41.022] error: Sending PersistInit msg: No
>>         error
>>         [2021-12-03T15:36:41.022] error: DBD_GET_ASSOCS failure: No error
>>         [2021-12-03T15:36:41.022] error: _conn_readable: persistent
>>         connection for fd 9 experienced error[104]: Connection reset
>>         by peer
>>         [2021-12-03T15:36:41.022] error: _slurm_persist_recv_msg:
>>         only read 0 of 2613 bytes
>>         [2021-12-03T15:36:41.022] error: Sending PersistInit msg: No
>>         error
>>         [2021-12-03T15:36:41.022] error: DBD_GET_RES failure: No error
>>         [2021-12-03T15:36:41.022] fatal: You are running with a
>>         database but for some reason we have no TRES from it.  This
>>         should only happen if the database is down and you don't have
>>         any state files.
>>
>>
>>
>>         On Thu, Dec 2, 2021 at 10:36 PM Brian Andrus
>>         <toomuchit at gmail.com> wrote:
>>
>>
>>             Your Slurm needs to be built with MySQL support. If you
>>             have mysql-devel installed, configure should pick it up;
>>             otherwise you can specify the location with
>>             --with-mysql_config when you configure/build Slurm.
>>
>>             Brian Andrus
>>
>>             On 12/2/2021 12:40 PM, Giuseppe G. A. Celano wrote:
>>>             Hi everyone,
>>>
>>>             I am having trouble getting slurmdbd to work. This is
>>>             the error I get:
>>>
>>>             error: Couldn't find the specified plugin name for
>>>             accounting_storage/mysql looking at all files
>>>             error: cannot find accounting_storage plugin for
>>>             accounting_storage/mysql
>>>             error: cannot create accounting_storage context for
>>>             accounting_storage/mysql
>>>             fatal: Unable to initialize accounting_storage/mysql
>>>             accounting storage plugin
>>>
>>>             I have installed mysql (apt install mysql) on Ubuntu
>>>             20.04.3 and followed the instructions on the Slurm
>>>             website <https://slurm.schedmd.com/accounting.html>;
>>>             mysql is running (port 3306) and these are the
>>>             relevant parts of my .conf files:
>>>
>>>             slurm.conf
>>>
>>>             # LOGGING AND ACCOUNTING
>>>             AccountingStorageHost=localhost
>>>             AccountingStoragePort=3306
>>>             AccountingStorageType=accounting_storage/slurmdbd
>>>             AccountingStorageUser=slurm
>>>             JobCompType=jobcomp/none
>>>             JobAcctGatherFrequency=30
>>>             JobAcctGatherType=jobacct_gather/linux
>>>             SlurmctldDebug=info
>>>             SlurmctldLogFile=/var/log/slurmctld.log
>>>             SlurmdDebug=info
>>>             SlurmdLogFile=/var/log/slurmd.log
>>>
>>>             slurmdbd.conf
>>>
>>>             AuthType=auth/munge
>>>             DbdAddr=localhost
>>>             DbdHost=localhost
>>>             DbdPort=3306
>>>             LogFile=/var/log/slurmdbd.log
>>>             PidFile=/var/run/slurmdbd.pid
>>>             PluginDir=/usr/lib/slurm
>>>             SlurmUser=slurm
>>>             StoragePass=password
>>>             StorageType=accounting_storage/mysql
>>>             StorageUser=slurm
>>>             StorageLoc=slurm_acct_db
>>>
>>>             I changed the port to 3306 because otherwise slurmdbd
>>>             could not communicate with mysql. If I run sacct,
>>>             for example, I get:
>>>
>>>             sacct: error: _slurm_persist_recv_msg: read of fd 3 failed: No error
>>>             sacct: error: _slurm_persist_recv_msg: only read 126 of 2616 bytes
>>>             sacct: error: slurm_persist_conn_open: No response to persist_init
>>>             sacct: error: Sending PersistInit msg: No error
>>>             JobID           JobName  Partition    Account  AllocCPUS      State ExitCode
>>>             ------------ ---------- ---------- ---------- ---------- ---------- --------
>>>             sacct: error: _slurm_persist_recv_msg: read of fd 3 failed: No error
>>>             sacct: error: _slurm_persist_recv_msg: only read 126 of 2616 bytes
>>>             sacct: error: Sending PersistInit msg: No error
>>>             sacct: error: DBD_GET_JOBS_COND failure: Unspecified error
>>>
>>>             Does anyone have a suggestion to solve this problem?
>>>             Thank you very much.
>>>
>>>             Best,
>>>             Giuseppe
>>
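
A side note on the slurm.conf and slurmdbd.conf excerpts quoted above. According to the slurmdbd.conf man page, DbdPort is the port slurmdbd itself listens on (default 6819) and must match AccountingStoragePort in slurm.conf, while StoragePort (default 3306) is the port of the MySQL/MariaDB server. With everything on localhost, setting DbdPort to 3306 would point Slurm clients at the database server instead of slurmdbd, which may explain the truncated persist_init replies. The sketch below is a rough consistency check under those assumptions. parse_conf and port_warnings are hypothetical helper names, not Slurm tools, and the parser handles only simple Key=Value lines.

```python
SLURMDBD_DEFAULT_PORT = 6819  # documented default for DbdPort / AccountingStoragePort
MYSQL_DEFAULT_PORT = 3306     # documented default for StoragePort


def parse_conf(text):
    """Parse simple Key=Value lines; keys are lowercased because
    Slurm treats parameter names case-insensitively."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


def port_warnings(slurm_conf_text, slurmdbd_conf_text):
    """Return human-readable warnings about suspicious port settings.

    Assumes slurmdbd and the database run on the same host, as in
    the thread (both are configured as localhost).
    """
    slurm = parse_conf(slurm_conf_text)
    dbd = parse_conf(slurmdbd_conf_text)
    dbd_port = int(dbd.get("dbdport", SLURMDBD_DEFAULT_PORT))
    storage_port = int(dbd.get("storageport", MYSQL_DEFAULT_PORT))
    acct_port = int(slurm.get("accountingstorageport", SLURMDBD_DEFAULT_PORT))

    warnings = []
    if dbd_port == storage_port:
        warnings.append(
            f"DbdPort ({dbd_port}) equals StoragePort; slurmdbd and the "
            "database cannot share a port on one host"
        )
    if acct_port != dbd_port:
        warnings.append(
            f"AccountingStoragePort ({acct_port}) does not match "
            f"DbdPort ({dbd_port})"
        )
    return warnings


if __name__ == "__main__":
    slurm_conf = "AccountingStorageHost=localhost\nAccountingStoragePort=3306\n"
    slurmdbd_conf = "DbdHost=localhost\nDbdPort=3306\nStorageLoc=slurm_acct_db\n"
    for message in port_warnings(slurm_conf, slurmdbd_conf):
        print("warning:", message)
```

Removing the DbdPort and AccountingStoragePort lines, so that both fall back to their defaults, would make this check pass. Whether that resolves the original errors also depends on the plugin being built.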