[slurm-users] "sacctmgr add cluster" crashing slurmdbd

Dustin Lang dstndstn at gmail.com
Tue May 5 22:21:45 UTC 2020


Hi,

I've just upgraded to Slurm 19.05.5.

Whether I use my old database or create an entirely new one, I am unable
to create a new 'cluster' entry in the database -- slurmdbd segfaults!

# sacctmgr add cluster test3
 Adding Cluster(s)
  Name           = test3
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y
sacctmgr: error: slurm_persist_conn_open_without_init: failed to open persistent connection to mn001:6819: Connection refused
sacctmgr: error: slurmdbd: Getting response to message type: DBD_ADD_CLUSTERS
 Problem adding clusters: Unspecified error
sacctmgr: error: slurmdbd: Sending PersistInit msg: Connection refused

Meanwhile, running "slurmdbd -D -v -v -v -v -v", I see:

[2020-05-05T18:17:19.503] debug4: 10(as_mysql_cluster.c:405) query
insert into txn_table (timestamp, action, name, actor, info) values
(1588717037, 1405, 'test3', 'root', 'mod_time=1588717037, shares=1,
grp_jobs=NULL, grp_jobs_accrue=NULL, grp_submit_jobs=NULL, grp_wall=NULL,
max_jobs=NULL, max_jobs_accrue=NULL, min_prio_thresh=NULL,
max_submit_jobs=NULL, max_wall_pj=NULL, priority=NULL, def_qos_id=NULL,
qos=\',1,\', federation=\'\', fed_id=0, fed_state=0, features=\'\'');
slurmdbd: debug4: 10(as_mysql_assoc.c:635) query
select id_assoc from "test3_assoc_table" where user='' and deleted = 0 and
acct='root';
[2020-05-05T18:17:19.506] debug4: 10(as_mysql_assoc.c:635) query
select id_assoc from "test3_assoc_table" where user='' and deleted = 0 and
acct='root';
slurmdbd: debug4: 10(as_mysql_assoc.c:714) query
call get_parent_limits('assoc_table', 'root', 'test3', 0); select @par_id,
@mj, @mja, @mpt, @msj, @mwpj, @mtpj, @mtpn, @mtmpj, @mtrm, @def_qos_id,
@qos, @delta_qos, @prio;
[2020-05-05T18:17:19.506] debug4: 10(as_mysql_assoc.c:714) query
call get_parent_limits('assoc_table', 'root', 'test3', 0); select @par_id,
@mj, @mja, @mpt, @msj, @mwpj, @mtpj, @mtpn, @mtmpj, @mtrm, @def_qos_id,
@qos, @delta_qos, @prio;
Segmentation fault (core dumped)


Since this happens even on a brand-new database, I just don't understand
how I can get back to a basic functional state.  This is exceedingly
frustrating.

Thanks for any hints.

--dustin