[slurm-users] slurmdbd purge not working
Julien Rey
julien.rey at univ-paris-diderot.fr
Fri Apr 5 13:05:07 UTC 2019
Hi Paul, thanks for your advice. Actually, I already tried what you
suggested. No matter what value I put after PurgeJobAfter, I always
end up with the same error:
sacctmgr archive dump Directory=/home/joule/archives/ PurgeJobAfter=1days
sacctmgr: error: slurmdbd: Getting response to message type 1459
sacctmgr: error: slurmdbd: DBD_ARCHIVE_DUMP failure: No error
Problem dumping archive: Unspecified error
sacctmgr archive dump Directory=/home/joule/archives/ PurgeJobAfter=48months
sacctmgr: error: slurmdbd: Getting response to message type 1459
sacctmgr: error: slurmdbd: DBD_ARCHIVE_DUMP failure: No error
Problem dumping archive: Unspecified error
Has anyone tried truncating the tables by hand, directly from the mysql
command line?
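
Before truncating anything by hand, a read-only look at which tables hold the bulk of the data may help decide what to target. The following is only a sketch: it builds a query against MySQL's information_schema using the database name from the slurmdbd.conf quoted below (slurm_acct_db) and prints it. The mysql invocation is left commented out, and the user name in it is an assumption taken from StorageUser.

```shell
#!/bin/sh
# Database name taken from StorageLoc in the quoted slurmdbd.conf.
DB=slurm_acct_db

# Read-only query: per-table row estimate and on-disk size (data + indexes), largest first.
SQL="SELECT table_name, table_rows, ROUND((data_length + index_length) / 1024 / 1024 / 1024, 2) AS size_gb FROM information_schema.tables WHERE table_schema = '$DB' ORDER BY (data_length + index_length) DESC;"

# Print the query so it can be reviewed before running it.
echo "$SQL"

# To actually run it (assumes the 'slurm' MySQL user may read information_schema):
# mysql -u slurm -p -e "$SQL"
```

Note that table_rows is only an estimate for InnoDB tables, so the size column is the more reliable of the two.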
On 04/04/2019 16:13, Paul Edmon wrote:
> We ran into this problem in the past. I know that fixes were put in
> to deal with large purges as a result of our problems but I don't
> recall what version they ended up in, likely newer than 15.08.0.
>
> A solution that can work is to walk up the time so that instead of one
> large purge you do several smaller purges. That at least worked for
> us in the past.
>
> -Paul Edmon-
>
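
The "walk up the time" approach Paul describes above could be scripted roughly as follows. This is a dry-run sketch that only prints the commands, oldest cutoff first, so that each one can be run and checked by hand. The start, stop and step values (in months) are illustrative assumptions, and nothing here establishes whether smaller steps get past the DBD_ARCHIVE_DUMP failure on 15.08.

```shell
#!/bin/sh
# Archive directory as used in the sacctmgr commands in this thread.
ARCHIVE_DIR=/home/joule/archives/

# Illustrative assumptions: start with a far-back cutoff and move toward the
# 18-month retention target in 6-month steps.
START=48
STOP=18
STEP=6

# Print one purge command per step; each step only has to remove the records
# that fall between the previous cutoff and the current one.
stepped_purge_commands() {
    months=$START
    while [ "$months" -ge "$STOP" ]; do
        echo "sacctmgr archive dump Directory=$ARCHIVE_DIR PurgeJobAfter=${months}months"
        months=$((months - STEP))
    done
}

stepped_purge_commands
```

With these values the script prints six commands, from PurgeJobAfter=48months down to PurgeJobAfter=18months. If START is newer than the oldest records in the database, the first step is still a large purge, so START would need to be chosen from the age of the oldest jobs.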
> On 4/4/19 9:38 AM, Julien Rey wrote:
>> Hello,
>>
>> Our slurm accounting database is growing bigger and bigger (more than
>> 100 GB) and is never being purged. We are running slurm 15.08.0-0pre1.
>> I would like to upgrade to a more recent version of the slurmdbd, but
>> my fear is that it may break everything during the update of the
>> database.
>>
>> Here is our slurmdbd.conf:
>>
>> AuthType=auth/munge
>> AuthInfo=/var/run/munge/munge.socket.2
>> DbdHost=localhost
>> DebugLevel=6
>> StorageHost=localhost
>> StorageLoc=slurm_acct_db
>> StoragePass=shazaam
>> StorageType=accounting_storage/mysql
>> StorageUser=slurm
>> LogFile=/var/log/slurm-llnl/slurmdbd.log
>> PidFile=/var/run/slurm-llnl/slurmdbd.pid
>> SlurmUser=slurm
>> ArchiveDir=/home/joule/archives
>> PurgeEventAfter=18
>> PurgeJobAfter=18
>> PurgeResvAfter=1
>> PurgeStepAfter=1
>> PurgeSuspendAfter=1
>>
>> I tried to purge it manually using this command but the slurmdbd
>> daemon ends up crashing and it doesn't remove anything:
>>
>> sacctmgr archive dump Directory=/home/joule/archives/ PurgeJobAfter=365days
>>
>> sacctmgr: error: slurmdbd: Getting response to message type 1459
>> sacctmgr: error: slurmdbd: DBD_ARCHIVE_DUMP failure: No error
>> Problem dumping archive: Unspecified error
>>
>> Sometimes I have to restart the mysql daemon (we are running mysql
>> 5.5.39-1). The /var/log/slurm-llnl/slurmdbd.log shows nothing. The
>> mysql logs are empty.
>>
>> I tried increasing these values in my.cnf, but so far without success:
>>
>> innodb_buffer_pool_size = 32G
>> innodb_lock_wait_timeout = 3600
>>
>> Is there any way to solve this issue? Otherwise, what would be the
>> procedure for deleting the database records altogether and starting
>> over with a fresh database?
>>
>> Thanks in advance.
>> --
>> Julien REY
>>
>> Plate-forme RPBS
>> Modélisation Computationnelle des Interactions Protéines-Ligand (CMPLI)
>> Université Paris Diderot - Paris VII
>> tel : 01 57 27 83 95
>
--
Julien REY
Plate-forme RPBS
Modélisation Computationnelle des Interactions Protéines-Ligand (CMPLI)
Université Paris Diderot - Paris VII
tel : 01 57 27 83 95