[slurm-users] "sacctmgr add cluster" crashing slurmdbd
Marcus Wagner
wagner at itc.rwth-aachen.de
Wed May 6 13:56:37 UTC 2020
Sorry, I forgot to mention: we are running Slurm 18.08.7.
I just saw in an earlier coredump that another (earlier) line is
involved:
2136: if (row2[ASSOC2_REQ_MTPJ][0])
the corresponding mysql response was:
+---------+------+------+------+------+-------+-------+-------+--------+-------+-------------+------+------------+
| @par_id | @mj  | @mja | @mpt | @msj | @mwpj | @mtpj | @mtpn | @mtmpj | @mtrm | @def_qos_id | @qos | @delta_qos |
+---------+------+------+------+------+-------+-------+-------+--------+-------+-------------+------+------------+
| 990     | 800  | NULL | NULL | 1000 | 1440  | NULL  | NULL  | NULL   | NULL  | NULL        | ,1,  | NULL       |
+---------+------+------+------+------+-------+-------+-------+--------+-------+-------------+------+------------+
1 row in set (0.00 sec)
So here @mtpj is NULL. In the other coredump it was "1=8", i.e. not
NULL, but @mtpn was NULL, so slurmdbd segfaulted at
2141: if (row2[ASSOC2_REQ_MTPN][0])
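
To illustrate why a check of this form can crash: the MySQL C API
represents a NULL column value as a NULL pointer in the fetched row, so
indexing [0] on it dereferences NULL. Below is a minimal sketch of a
NULL-safe test, assuming a row shaped like the result above. The names
REQ_* , field_is_set and sample_row are hypothetical and chosen only for
this example; this is not the Slurm code or an actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical column indices, following the order of the result set
 * shown above (only the first eight columns are modelled here). */
enum {
    REQ_PAR_ID, REQ_MJ, REQ_MJA, REQ_MPT,
    REQ_MSJ, REQ_MWPJ, REQ_MTPJ, REQ_MTPN,
    REQ_COUNT
};

/* Stand-in for a fetched row: SQL NULL is represented by a NULL pointer,
 * which is how the MySQL C API reports NULL column values. */
static const char *const sample_row[REQ_COUNT] = {
    "990", "800", NULL, NULL, "1000", "1440", NULL, NULL
};

/* Returns 1 if the column holds a non-NULL, non-empty string, else 0.
 * The pointer is tested before its first character is read, so a NULL
 * column is reported as "not set" instead of being dereferenced. */
static int field_is_set(const char *const *row, int col)
{
    assert(row != NULL);
    return row[col] != NULL && row[col][0] != '\0';
}
```

With this helper, field_is_set(sample_row, REQ_MTPN) evaluates to 0
rather than crashing, while field_is_set(sample_row, REQ_MSJ) evaluates
to 1.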
Could someone whose slurmdbd does not segfault please run the call
directly in the database (it is a procedure generated by slurmdbd that
collects the parent limits of an association) and report the result here?
Best
Marcus
On 06.05.2020 at 09:49, Ben Polman wrote:
> On 06-05-2020 07:38, Chris Samuel wrote:
>
> We are experiencing exactly the same problem after a MySQL upgrade to 5.7.30;
> moving the database to an old MySQL server running 5.6 solves the problem.
> Most likely, downgrading MySQL to 5.7.29 will work as well.
>
> I have no clue which change in mysql-server is causing this.
>
> best regards,
> Ben
>
>> On Tuesday, 5 May 2020 3:21:45 PM PDT Dustin Lang wrote:
>>
>>> Since this happens on a fresh new database, I just don't understand how I
>>> can get back to a basic functional state. This is exceedingly frustrating.
>> I have to say that if you're seeing this with 17.11, 18.08 and 19.05 and this
>> only started when your colleague upgraded MySQL then this sounds like MySQL is
>> triggering this problem.
>>
>> We're running with MariaDB 10.x (from SLES15) without issues (our database is
>> huge).
>>
>> All the best,
>> Chris
>
>