<div dir="ltr"><div class="gmail_default" style="font-family:monospace,monospace">We ran into this issue trying to move from 16.05.3 -> 17.11.7 with 1.5M records in job table.</div><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div class="gmail_default" style="font-family:monospace,monospace">In our first attempt, MySQL reported "ERROR 1206 The total number of locks exceeds the lock table size" after about 7 hours. </div><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div class="gmail_default" style="font-family:monospace,monospace">Increased InnoDB Buffer Pool size - <a href="https://dba.stackexchange.com/questions/27328/how-large-should-be-mysql-innodb-buffer-pool-size">https://dba.stackexchange.com/questions/27328/how-large-should-be-mysql-innodb-buffer-pool-size</a> - to 12G (the machine hosting mysql has 128GB) and restarted the conversion and which then completed successfully in 6.5 hours.</div><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div class="gmail_default" style="font-family:monospace,monospace">I am sure there are other MySQL tweaks that can be applied catered towards SLURM, will be useful if we can pool them together into the documentation.</div><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div class="gmail_default" style="font-family:monospace,monospace">Cheers,</div><div class="gmail_default" style="font-family:monospace,monospace">Roshan </div><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div class="gmail_default" style="font-family:monospace,monospace"><br></div></div><br><div class="gmail_quote"><div dir="ltr">On Wed, 21 Feb 2018 at 23:59, Christopher Benjamin Coffey <<a href="mailto:Chris.Coffey@nau.edu">Chris.Coffey@nau.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hello,<br>

On Wed, 21 Feb 2018 at 23:59, Christopher Benjamin Coffey <Chris.Coffey@nau.edu> wrote:

Hello,

We have been trying to upgrade Slurm on our cluster from 16.05.6 to 17.11.3; I'm thinking this should be doable? Past upgrades have been a breeze, and I believe during the last one the db upgrade took about 25 minutes. Now, though, the db upgrade process is taking far too long. We previously attempted the upgrade during a maintenance window, and the process had not completed after 24 hours, so I gave up and reverted to the old Slurm version by restoring a backup of the db.

Since the failed attempt, I've archived a bunch of jobs, as we had 4 years of jobs in the database; we now keep only the last 1.5 years' worth. This reduced our db size from 3.7GB to 1.1GB, and we are now archiving jobs regularly through Slurm.
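
For reference, the regular archiving is driven by the purge/archive options in slurmdbd.conf; a rough sketch (parameter names are from the slurmdbd.conf man page, while the directory and retention values below are only illustrative):

    # slurmdbd.conf -- archive old records to flat files, then purge them
    ArchiveDir=/var/spool/slurm/archive   # must be writable by slurmdbd
    ArchiveJobs=yes
    ArchiveSteps=yes
    PurgeJobAfter=18months                # keep ~1.5 years of jobs
    PurgeStepAfter=18months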

I've finally had time to look at this a bit more, and we've restored the reduced database onto another system to test the upgrade process in a dev environment, hoping to prove that the slimmed-down db will upgrade within a reasonable amount of time. Yet the current upgrade on this dev system has already taken 20 hours. The database has 1.8M jobs; that doesn't seem like that many jobs!

The conversion is stuck on this command:
update "monsoon_job_table" as job left outer join ( select job_db_inx, SUM(consumed_energy) 'sum_energy' from "monsoon_step_table" where id_step >= 0 and consumed_energy != 18446744073709551614 group by job_db_inx ) step on job.job_db_inx=step.job_db_inx set job.tres_alloc=concat(job.tres_alloc, concat(',3=', case when step.sum_energy then step.sum_energy else 18446744073709551614 END)) where job.tres_alloc != '' && job.tres_alloc not like '%,3=%':<br>

The system is no slouch:

    28 cores (E5-2680 v4 @ 2.4GHz)
    SSD
    128GB memory

Anyone have this issue? Anyone have a suggestion? This seems like a ridiculous amount of time to need for the upgrade! The database is healthy as far as I can see: no errors in the slurmdbd log, etc.

Let me know if you need more info!

Best,
Chris

—
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167