[slurm-users] slurmdbd does not work
Brian Andrus
toomuchit at gmail.com
Fri Dec 3 16:13:01 UTC 2021
You will also need to reinstall and restart slurmdbd with the updated binary.
Look in the slurmdbd logs to see what is happening there. I suspect it
had errors updating/creating the database and tables. If you have no
data in it yet, you can just DROP the database and restart slurmdbd.
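
Roughly, the steps above could be scripted like this. It is only a sketch
under assumptions: the log path, database name and plugin directory are
copied from the slurmdbd.conf quoted further down, while the systemd unit
name "slurmdbd", MySQL root access, and the run/reset_slurmdbd helpers are
hypothetical. By default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only. LogFile, StorageLoc and PluginDir values are copied from the
# quoted slurmdbd.conf; the systemd unit name "slurmdbd" and MySQL root
# access are assumptions. DRY_RUN=1 (the default) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper wrapping the steps described above.
reset_slurmdbd() {
    db="${1:-slurm_acct_db}"
    log="${2:-/var/log/slurmdbd.log}"
    plugin_dir="${3:-/usr/lib/slurm}"

    # 1. Confirm the rebuilt MySQL plugin is actually installed.
    run ls -l "$plugin_dir/accounting_storage_mysql.so"
    # 2. See what slurmdbd logged about creating or updating its tables.
    run tail -n 50 "$log"
    # 3. Only if the database holds no data you care about: drop it.
    run mysql -u root -p -e "DROP DATABASE IF EXISTS $db;"
    # 4. Restart slurmdbd so it can recreate the database, then re-check the log.
    run systemctl restart slurmdbd
    run tail -n 50 "$log"
}

reset_slurmdbd "$@"
```

Run it once as-is to review the commands, then with DRY_RUN=0 as root to
execute them. slurmdbd should recreate the database on restart, assuming
the StorageUser has the privileges to do so.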
Brian Andrus
On 12/3/2021 6:42 AM, Giuseppe G. A. Celano wrote:
> Thanks for the answer, Brian. I have now added
> --with-mysql_config=/etc/mysql/my.cnf, but the problem persists,
> and now slurmctld does not work either. It fails with the error:
>
> [2021-12-03T15:36:41.018] accounting_storage/slurmdbd:
> clusteracct_storage_p_register_ctld: Registering slurmctld at port
> 6817 with slurmdbd
> [2021-12-03T15:36:41.019] error: _conn_readable: persistent connection
> for fd 9 experienced error[104]: Connection reset by peer
> [2021-12-03T15:36:41.019] error: _slurm_persist_recv_msg: only read
> 150 of 2613 bytes
> [2021-12-03T15:36:41.019] error: Sending PersistInit msg: No error
> [2021-12-03T15:36:41.020] error: _conn_readable: persistent connection
> for fd 9 experienced error[104]: Connection reset by peer
> [2021-12-03T15:36:41.020] error: _slurm_persist_recv_msg: only read
> 150 of 2613 bytes
> [2021-12-03T15:36:41.020] error: Sending PersistInit msg: No error
> [2021-12-03T15:36:41.020] error: _conn_readable: persistent connection
> for fd 9 experienced error[104]: Connection reset by peer
> [2021-12-03T15:36:41.020] error: _slurm_persist_recv_msg: only read
> 150 of 2613 bytes
> [2021-12-03T15:36:41.020] error: Sending PersistInit msg: No error
> [2021-12-03T15:36:41.020] error: DBD_GET_TRES failure: No error
> [2021-12-03T15:36:41.021] error: _conn_readable: persistent connection
> for fd 9 experienced error[104]: Connection reset by peer
> [2021-12-03T15:36:41.021] error: _slurm_persist_recv_msg: only read 0
> of 2613 bytes
> [2021-12-03T15:36:41.021] error: Sending PersistInit msg: No error
> [2021-12-03T15:36:41.021] error: DBD_GET_QOS failure: No error
> [2021-12-03T15:36:41.021] error: _conn_readable: persistent connection
> for fd 9 experienced error[104]: Connection reset by peer
> [2021-12-03T15:36:41.021] error: _slurm_persist_recv_msg: only read
> 150 of 2613 bytes
> [2021-12-03T15:36:41.021] error: Sending PersistInit msg: No error
> [2021-12-03T15:36:41.021] error: DBD_GET_USERS failure: No error
> [2021-12-03T15:36:41.022] error: _conn_readable: persistent connection
> for fd 9 experienced error[104]: Connection reset by peer
> [2021-12-03T15:36:41.022] error: _slurm_persist_recv_msg: only read 0
> of 2613 bytes
> [2021-12-03T15:36:41.022] error: Sending PersistInit msg: No error
> [2021-12-03T15:36:41.022] error: DBD_GET_ASSOCS failure: No error
> [2021-12-03T15:36:41.022] error: _conn_readable: persistent connection
> for fd 9 experienced error[104]: Connection reset by peer
> [2021-12-03T15:36:41.022] error: _slurm_persist_recv_msg: only read 0
> of 2613 bytes
> [2021-12-03T15:36:41.022] error: Sending PersistInit msg: No error
> [2021-12-03T15:36:41.022] error: DBD_GET_RES failure: No error
> [2021-12-03T15:36:41.022] fatal: You are running with a database but
> for some reason we have no TRES from it. This should only happen if
> the database is down and you don't have any state files.
>
>
>
> On Thu, Dec 2, 2021 at 10:36 PM Brian Andrus <toomuchit at gmail.com> wrote:
>
>
> Your Slurm needs to be built with MySQL support. If you have mysql-devel
> installed it should pick it up; otherwise you can specify the
> location with --with-mysql_config when you configure/build Slurm.
>
> Brian Andrus
>
> On 12/2/2021 12:40 PM, Giuseppe G. A. Celano wrote:
>> Hi everyone,
>>
>> I am having trouble getting slurmdbd to work. This is the error
>> I get:
>>
>> error: Couldn't find the specified plugin name for
>> accounting_storage/mysql looking at all files
>> error: cannot find accounting_storage plugin for
>> accounting_storage/mysql
>> error: cannot create accounting_storage context for
>> accounting_storage/mysql
>> fatal: Unable to initialize accounting_storage/mysql accounting
>> storage plugin
>>
>> I have installed mysql (apt install mysql) on Ubuntu 20.04.03
>> and followed the instructions on the slurm website
>> <https://slurm.schedmd.com/accounting.html>; mysql is running
>> (port 3306) and these are the relevant parts in my .conf files:
>>
>> slurm.conf
>>
>> # LOGGING AND ACCOUNTING
>> AccountingStorageHost=localhost
>> AccountingStoragePort=3306
>> AccountingStorageType=accounting_storage/slurmdbd
>> AccountingStorageUser=slurm
>> JobCompType=jobcomp/none
>> JobAcctGatherFrequency=30
>> JobAcctGatherType=jobacct_gather/linux
>> SlurmctldDebug=info
>> SlurmctldLogFile=/var/log/slurmctld.log
>> SlurmdDebug=info
>> SlurmdLogFile=/var/log/slurmd.log
>>
>> slurmdbd.conf
>>
>> AuthType=auth/munge
>> DbdAddr=localhost
>> DbdHost=localhost
>> DbdPort=3306
>> LogFile=/var/log/slurmdbd.log
>> PidFile=/var/run/slurmdbd.pid
>> PluginDir=/usr/lib/slurm
>> SlurmUser=slurm
>> StoragePass=password
>> StorageType=accounting_storage/mysql
>> StorageUser=slurm
>> StorageLoc=slurm_acct_db
>>
>> I changed the port to 3306 because otherwise slurmdbd could not
>> communicate with mysql. If I run sacct, for example, I get:
>>
>> sacct: error: _slurm_persist_recv_msg: read of fd 3 failed: No error
>> sacct: error: _slurm_persist_recv_msg: only read 126 of 2616 bytes
>> sacct: error: slurm_persist_conn_open: No response to persist_init
>> sacct: error: Sending PersistInit msg: No error
>> JobID JobName Partition Account AllocCPUS State ExitCode
>> ------------ ---------- ---------- ---------- ---------- ---------- --------
>> sacct: error: _slurm_persist_recv_msg: read of fd 3 failed: No error
>> sacct: error: _slurm_persist_recv_msg: only read 126 of 2616 bytes
>> sacct: error: Sending PersistInit msg: No error
>> sacct: error: DBD_GET_JOBS_COND failure: Unspecified error
>>
>> Does anyone have a suggestion to solve this problem? Thank you
>> very much.
>>
>> Best,
>> Giuseppe
>