[slurm-users] "sacctmgr add cluster" crashing slurmdbd

Marcus Wagner wagner at itc.rwth-aachen.de
Wed May 6 13:56:37 UTC 2020


Sorry, I forgot to mention: we are running Slurm 18.08.7.

I just saw in an earlier coredump that another (earlier) line is also 
involved:

2136:             if (row2[ASSOC2_REQ_MTPJ][0])

The corresponding MySQL response was:

+---------+------+------+------+------+-------+-------+-------+--------+-------+-------------+------+------------+
| @par_id | @mj  | @mja | @mpt | @msj | @mwpj | @mtpj | @mtpn | @mtmpj | @mtrm | @def_qos_id | @qos | @delta_qos |
+---------+------+------+------+------+-------+-------+-------+--------+-------+-------------+------+------------+
|     990 |  800 | NULL | NULL | 1000 |  1440 | NULL  | NULL  | NULL   | NULL  |        NULL | ,1,  | NULL       |
+---------+------+------+------+------+-------+-------+-------+--------+-------+-------------+------+------------+
1 row in set (0.00 sec)


So here @mtpj is NULL. In the other coredump it was "1=8", i.e. not 
NULL, but there @mtpn was NULL, so slurmdbd segfaulted at

2141:            if (row2[ASSOC2_REQ_MTPN][0])
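
To illustrate why these lines crash: in the MySQL C API a fetched row is 
an array of char pointers, and a column holding SQL NULL is returned as a 
NULL pointer, so row2[...][0] dereferences NULL. Below is a minimal 
sketch of a guard against that. The helper names and the DEMO_* enum are 
hypothetical stand-ins of mine; this is neither the actual Slurm code nor 
an official fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Slurm: returns 1 only when the
 * column is neither SQL NULL (a NULL pointer in the MySQL C API)
 * nor an empty string. */
static int field_is_set(const char *field)
{
    return field != NULL && field[0] != '\0';
}

/* Hypothetical stand-ins for indices such as ASSOC2_REQ_MTPJ and
 * ASSOC2_REQ_MTPN used in the real code. */
enum { DEMO_MTPJ, DEMO_MTPN, DEMO_COUNT };

/* Count how many columns of a row actually carry a value. */
static int count_set_fields(const char *const row[], size_t n)
{
    int set = 0;

    assert(row != NULL);
    for (size_t i = 0; i < n; i++) {
        if (field_is_set(row[i]))
            set++;
    }
    return set;
}
```

With a row like { "1=8", NULL }, as in the second coredump, 
field_is_set() is true for the first column and false for the second, 
without ever touching the NULL pointer.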

Could anyone whose slurmdbd is not segfaulting please run the call 
directly in the database (it is a procedure, generated by slurmdbd, that 
collects the parent limits of an association) and report the result here?


Best
Marcus

On 06.05.2020 at 09:49, Ben Polman wrote:
> On 06-05-2020 07:38, Chris Samuel wrote:
> 
> We are experiencing exactly the same problem after a MySQL upgrade to 5.7.30.
> Moving the database to an old MySQL server running 5.6 solves the problem.
> Most likely downgrading MySQL to 5.7.29 will work as well.
> 
> I have no clue which change in mysql-server is causing this.
> 
> best regards,
> Ben
> 
>> On Tuesday, 5 May 2020 3:21:45 PM PDT Dustin Lang wrote:
>>
>>> Since this happens on a fresh new database, I just don't understand how I
>>> can get back to a basic functional state.  This is exceedingly frustrating.
>> I have to say that if you're seeing this with 17.11, 18.08 and 19.05 and this
>> only started when your colleague upgraded MySQL then this sounds like MySQL is
>> triggering this problem.
>>
>> We're running with MariaDB 10.x (from SLES15) without issues (our database is
>> huge).
>>
>> All the best,
>> Chris
> 
> 
