[slurm-users] Extreme long db upgrade 16.05.6 -> 17.11.3

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Wed Apr 3 11:28:55 UTC 2019


Hi Lech,

Maybe you could add your arguments to the bug report
https://bugs.schedmd.com/show_bug.cgi?id=6796 in the hope that SchedMD
can be convinced that this is a useful patch for future versions of
Slurm, including for sites running MySQL/MariaDB 5.5 and newer.

Best regards,
Ole


On 4/3/19 1:17 PM, Lech Nieroda wrote:
> Hi Ole,
> 
>> On 3 April 2019 at 12:53, Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk> wrote:
>> SchedMD already decided that they won't fix the problem:
> 
> Yes, I guess it’s a bit late in the release lifecycles. Nevertheless it’s a pity, as there are certainly a lot of users around who’d rather not upgrade their distribution default mysql-servers just for the sake of a conversion.
> 
>> Can you confirm that your patch is only relevant for an old MySQL 5.1?
>>
>> On our CentOS 7 systems we run the OS's MariaDB server 5.5.  Would MySQL/MariaDB version 5.5 be affected by your patch or not?
> 
> The patch will work with any MySQL version >= 5.1, since all it does is simplify the query by changing an implicit derived table into an explicit temporary table.
> This reduces the query complexity, and the execution order no longer depends on the “intelligence” of the MySQL optimizer, while the end results are exactly the same.
> 
> We haven’t tested whether the MySQL 5.5 optimizer chooses the right execution plan for this query.
> As I’ve said, it took roughly 17 minutes with 11 million jobs, 18 million steps and an InnoDB buffer pool size of 8G.
> If the table conversion takes more than half an hour and you don’t have tens of millions of jobs, then the optimizer has a problem and the patch would help you.
> 
> 
> Kind regards,
> Lech
> 
> 
>>
>> Best regards,
>> Ole
>>
>> On 4/3/19 12:30 PM, Lech Nieroda wrote:
>>> Hello Chris,
>>> I’ve submitted the bug report together with a patch.
>>> We don’t have a support contract, but I suppose they’ll at least read it ;)
>>> The code is identical for 18.08.x and 19.05.x, it’s just a different offset.
>>> Kind regards,
>>> Lech
>>>> On 2 April 2019 at 15:18, Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk> wrote:
>>>>
>>>> Hi Lech,
>>>>
>>>> IMHO, the Slurm user community would benefit the most from your interesting work on MySQL/MariaDB performance if your patch could be made against the current 18.08 and the coming 19.05 releases.  This would ensure that your work is carried forward.
>>>>
>>>> Would you be able to make patches against 18.08 and 19.05?  If you submit the patches to SchedMD, my guess is that they'd be very interested.  A site with a SchedMD support contract (such as our site) could also submit a bug report including your patch.
>>>>
>>>> /Ole
>>>>
>>>> On 4/2/19 2:56 PM, Lech Nieroda wrote:
>>>>> That’s probably it.
>>>>> Sub-queries are known for potential performance issues, so one wonders why the devs didn’t extract it accordingly and make the code more robust, or at least compatible with RHEL/CentOS 6, rather than including that remark in the release notes.
>>>>>> On 2 April 2019 at 07:20, Chris Samuel <chris at csamuel.org> wrote:
>>>>>>
>>>>>> On Monday, 1 April 2019 7:55:09 AM PDT Lech Nieroda wrote:
>>>>>>
>>>>>>> Further analysis of the query has shown that the mysql optimizer has chosen
>>>>>>> the wrong execution plan. This may depend on the mysql version, ours was
>>>>>>> 5.1.69.
>>>>>>
>>>>>> I suspect this is the issue documented in the release notes for 17.11:
>>>>>>
>>>>>> https://github.com/SchedMD/slurm/blob/slurm-17.11/RELEASE_NOTES
>>>>>>
>>>>>> NOTE FOR THOSE UPGRADING SLURMDBD: The database conversion process from
>>>>>>       SlurmDBD 16.05 or 17.02 may not work properly with MySQL 5.1 (as was the
>>>>>>       default version for RHEL 6).  Upgrading to a newer version of MariaDB or
>>>>>>       MySQL is strongly encouraged to prevent this problem.
>>
> 
> 
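For readers unfamiliar with the rewrite described in the thread above (turning an implicit derived table into an explicit temporary table), here is a minimal sketch of the general idea. It is not the actual Slurm conversion query or the patch attached to the bug report: the table and column names (job_table, step_table, cpu_sec, step_totals) are made up for illustration, and it uses Python's built-in sqlite3 module rather than MySQL, so it only shows that the two forms return the same rows, not the performance difference.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database. The schema is hypothetical, not Slurm's."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE job_table (job_db_inx INTEGER PRIMARY KEY, id_user INTEGER);
        CREATE TABLE step_table (job_db_inx INTEGER, id_step INTEGER, cpu_sec INTEGER);
        INSERT INTO job_table VALUES (1, 100), (2, 100), (3, 200);
        INSERT INTO step_table VALUES (1, 0, 10), (1, 1, 5), (2, 0, 7), (3, 0, 2);
    """)
    return conn


def with_derived_table(conn):
    """One statement: the subquery in FROM is an implicit derived table,
    and the optimizer decides how and when to evaluate it."""
    return conn.execute("""
        SELECT j.job_db_inx, j.id_user, s.total
        FROM job_table AS j
        JOIN (SELECT job_db_inx, SUM(cpu_sec) AS total
              FROM step_table
              GROUP BY job_db_inx) AS s
          ON s.job_db_inx = j.job_db_inx
        ORDER BY j.job_db_inx
    """).fetchall()


def with_temporary_table(conn):
    """Two statements: the aggregate is materialized first, so the
    evaluation order is fixed by the caller rather than the optimizer."""
    conn.execute("""
        CREATE TEMPORARY TABLE step_totals AS
        SELECT job_db_inx, SUM(cpu_sec) AS total
        FROM step_table
        GROUP BY job_db_inx
    """)
    rows = conn.execute("""
        SELECT j.job_db_inx, j.id_user, s.total
        FROM job_table AS j
        JOIN step_totals AS s ON s.job_db_inx = j.job_db_inx
        ORDER BY j.job_db_inx
    """).fetchall()
    conn.execute("DROP TABLE step_totals")
    return rows


if __name__ == "__main__":
    conn = build_demo_db()
    # Both forms are expected to produce identical result sets.
    print(with_derived_table(conn) == with_temporary_table(conn))
```

The second form trades one statement for two, which makes the intermediate result explicit. Whether that actually helps on a given MySQL/MariaDB version depends on that version's optimizer, as discussed in the thread.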



