[slurm-users] Extreme long db upgrade 16.05.6 -> 17.11.3
Christopher Benjamin Coffey
Chris.Coffey at nau.edu
Wed Feb 21 17:30:05 MST 2018
This is great to know, Kurt. We can't be the only folks running into this. I wonder if the MySQL update code gets into a deadlock or something. I'm hoping a Slurm dev will chime in...
Kurt, out of band if need be, I'd be interested in the details of what you ended up doing.
Northern Arizona University
On 2/21/18, 5:08 PM, "slurm-users on behalf of Kurt H Maier" <slurm-users-bounces at lists.schedmd.com on behalf of khm at sciops.net> wrote:
On Wed, Feb 21, 2018 at 11:56:38PM +0000, Christopher Benjamin Coffey wrote:
> We have been trying to upgrade slurm on our cluster from 16.05.6 to 17.11.3. I'm thinking this should be doable? Past upgrades have been a breeze, and I believe during the last one, the db upgrade took like 25 minutes. Well now, the db upgrade process is taking far too long. We previously attempted the upgrade during a maintenance window and the upgrade process did not complete after 24 hrs. I gave up on the upgrade and reverted the slurm version back by restoring a backup db.
We hit this on our try as well: upgrading from 17.02.9 to 17.11.3. We
truncated our job history for the upgrade, and then did the rest of the
conversion out-of-band and re-imported it after the fact. It took us
almost sixteen hours to convert a 1.5 million-job store.
We got hung up on precisely the same query you did, on a similarly hefty
machine. It caused us to roll back an upgrade and try again during our
subsequent maintenance window with the above approach.