[slurm-users] slurmdbd purge not working

Julien Rey julien.rey at univ-paris-diderot.fr
Fri Apr 5 13:05:07 UTC 2019


Hi Paul, thanks for your advice. Actually, I already tried what you 
suggested. No matter what value I put after PurgeJobAfter, I always 
end up with the same error:

sacctmgr archive dump Directory=/home/joule/archives/ PurgeJobAfter=1days
sacctmgr: error: slurmdbd: Getting response to message type 1459
sacctmgr: error: slurmdbd: DBD_ARCHIVE_DUMP failure: No error
  Problem dumping archive: Unspecified error

sacctmgr archive dump Directory=/home/joule/archives/ PurgeJobAfter=48months
sacctmgr: error: slurmdbd: Getting response to message type 1459
sacctmgr: error: slurmdbd: DBD_ARCHIVE_DUMP failure: No error
  Problem dumping archive: Unspecified error
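
For reference, here is a minimal sketch of the stepwise purge Paul 
describes in the quoted message below, written as a dry run that only 
prints the commands. The directory and the PurgeJobAfter syntax are taken 
from the commands above; START, STOP, STEP and the purge_steps function 
are illustrative assumptions, not tested recommendations. START would 
need to sit close to the age of the oldest records for the first pass 
to stay small.

```shell
#!/bin/sh
# Dry run: print one archive/purge command per step, oldest cutoff first.
# Nothing is executed against slurmdbd; the output is meant for review.
ARCHIVE_DIR=/home/joule/archives/
START=48   # assumed first cutoff, in months (purge jobs older than this)
STOP=18    # final cutoff, matching PurgeJobAfter=18 in slurmdbd.conf
STEP=6     # assumed number of months to tighten the cutoff on each pass

purge_steps() {
    months=$START
    while [ "$months" -gt "$STOP" ]; do
        echo "sacctmgr archive dump Directory=$ARCHIVE_DIR PurgeJobAfter=${months}months"
        months=$((months - STEP))
    done
    # Always finish exactly on the target cutoff.
    echo "sacctmgr archive dump Directory=$ARCHIVE_DIR PurgeJobAfter=${STOP}months"
}

purge_steps
```

Each printed line can then be run by hand, checking that slurmdbd is 
still up before moving on to the next, smaller cutoff.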

Has anyone tried truncating the tables by hand, directly from the mysql 
command line?

On 04/04/2019 16:13, Paul Edmon wrote:
> We ran into this problem in the past.  I know that fixes were put in 
> to deal with large purges as a result of our problems but I don't 
> recall what version they ended up in, likely newer than 15.08.0.
>
> A solution that can work is to walk up the time so that instead of one 
> large purge you do several smaller purges.  That at least worked for 
> us in the past.
>
> -Paul Edmon-
>
> On 4/4/19 9:38 AM, Julien Rey wrote:
>> Hello,
>>
>> Our slurm accounting database is growing bigger and bigger (more than 
>> 100 GB) and is never being purged. We are running slurm 15.08.0-0pre1. 
>> I would like to upgrade to a more recent version of the slurmdbd, but 
>> my fear is that it may break everything during the update of the 
>> database.
>>
>> Here is our slurmdbd.conf :
>>
>> AuthType=auth/munge
>> AuthInfo=/var/run/munge/munge.socket.2
>> DbdHost=localhost
>> DebugLevel=6
>> StorageHost=localhost
>> StorageLoc=slurm_acct_db
>> StoragePass=shazaam
>> StorageType=accounting_storage/mysql
>> StorageUser=slurm
>> LogFile=/var/log/slurm-llnl/slurmdbd.log
>> PidFile=/var/run/slurm-llnl/slurmdbd.pid
>> SlurmUser=slurm
>> ArchiveDir=/home/joule/archives
>> PurgeEventAfter=18
>> PurgeJobAfter=18
>> PurgeResvAfter=1
>> PurgeStepAfter=1
>> PurgeSuspendAfter=1
>>
>> I tried to purge it manually using this command but the slurmdbd 
>> daemon ends up crashing and it doesn't remove anything:
>>
>> sacctmgr archive dump Directory=/home/joule/archives/ 
>> PurgeJobAfter=365days
>>
>> sacctmgr: error: slurmdbd: Getting response to message type 1459
>> sacctmgr: error: slurmdbd: DBD_ARCHIVE_DUMP failure: No error
>>  Problem dumping archive: Unspecified error
>>
>> Sometimes I have to restart the mysql daemon (we are running mysql 
>> 5.5.39-1). The /var/log/slurm-llnl/slurmdbd.log shows nothing. The 
>> mysql logs are empty.
>>
>> I tried increasing these values in my.cnf, but so far without success:
>>
>> innodb_buffer_pool_size        = 32G
>> innodb_lock_wait_timeout    = 3600
>>
>> Is there any way to solve this issue? Otherwise, what would be the 
>> procedure for deleting the database records altogether and starting 
>> over with a fresh database?
>>
>> Thanks in advance.
>> -- 
>> Julien REY
>>
>> Plate-forme RPBS
>> Modélisation Computationnelle des Interactions Protéines-Ligand (CMPLI)
>> Université Paris Diderot - Paris VII
>> tel : 01 57 27 83 95
>


-- 
Julien REY

Plate-forme RPBS
Modélisation Computationnelle des Interactions Protéines-Ligand (CMPLI)
Université Paris Diderot - Paris VII
tel : 01 57 27 83 95



