<div dir="ltr"><div>We experienced the same problem. On our two new clusters with smaller databases (<1 million jobs), the upgrade from 17.02.9 to 17.11.2 and 17.11.3 was quick and smooth. On the third, older cluster, where we have a larger database (>30 million jobs), the upgrade was a mess with both MySQL and MariaDB. It got stuck on that exact query, the energy consumption one on a job table. I tried a few tricks to get around it, only to get stuck on other queries instead. <br><br>I spent some time on it without figuring out exactly why the conversion kept getting stuck. Then I decided to install 17.11 with a fresh database and add the necessary information to it. <br><br>Basically, all our policy information is regularly imported from an external infrastructure, so we could rerun those scripts to recreate the data. Keeping our historical accounting data "hot" in the database was not needed either, although it has been convenient at times, which is why we had not been actively purging it before. All things considered, I decided not to dig deeper into the conversion issue. <br><br></div><div>We're very happy with the performance of 17.11 now that it's up and running; they've cleaned up a bunch of unnecessary locks that have caused bottlenecks for us in the past. Good luck with the conversion!<br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Feb 22, 2018 at 1:30 AM, Christopher Benjamin Coffey <span dir="ltr"><<a href="mailto:Chris.Coffey@nau.edu" target="_blank">Chris.Coffey@nau.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">This is great to know, Kurt. We can't be the only folks running into this. I wonder if the MySQL update code gets into a deadlock or something. I'm hoping a Slurm dev will chime in ...<br>
<br>
Kurt, out of band if need be, I'd be interested in the details of what you ended up doing.<br>
<span class="im HOEnZb"><br>
Best,<br>
Chris<br>
<br>
—<br>
Christopher Coffey<br>
High-Performance Computing<br>
Northern Arizona University<br>
<a href="tel:928-523-1167" value="+19285231167">928-523-1167</a><br>
<br>
<br>
</span><div class="HOEnZb"><div class="h5">On 2/21/18, 5:08 PM, "slurm-users on behalf of Kurt H Maier" <<a href="mailto:slurm-users-bounces@lists.schedmd.com">slurm-users-bounces@lists.<wbr>schedmd.com</a> on behalf of <a href="mailto:khm@sciops.net">khm@sciops.net</a>> wrote:<br>
<br>
On Wed, Feb 21, 2018 at 11:56:38PM +0000, Christopher Benjamin Coffey wrote:<br>
> Hello,<br>
><br>
> We have been trying to upgrade Slurm on our cluster from 16.05.6 to 17.11.3. I'm thinking this should be doable? Past upgrades have been a breeze, and I believe during the last one the db upgrade took about 25 minutes. Now the db upgrade process is taking far too long. We previously attempted the upgrade during a maintenance window, and the upgrade process had not completed after 24 hours. I gave up on the upgrade and reverted to the previous Slurm version by restoring a backup db.<br>
<br>
We hit this on our attempt as well, upgrading from 17.02.9 to 17.11.3. We<br>
truncated our job history for the upgrade, and then did the rest of the<br>
conversion out-of-band and re-imported it after the fact. It took us<br>
almost sixteen hours to convert a 1.5 million-job store.<br>
<br>
We got hung up on precisely the same query you did, on a similarly hefty<br>
machine. It caused us to roll back an upgrade and try again during our<br>
subsequent maintenance window with the above approach.<br>
<br>
khm<br>
<br>
<br>
<br>
</div></div></blockquote></div><br></div>