[slurm-users] Problem with slurmctld communication with slurmdbd

Bruno Santos bacmsantos at gmail.com
Wed Nov 29 06:01:23 MST 2017


Hi everyone,

I have set up Slurm to use slurmdbd and all was working fine. However, I had
to change slurm.conf to play with user priority, and upon restarting,
slurmctld fails with the messages below. It seems that it is somehow
trying to use the MySQL password as a munge socket?
Any idea how to solve this?


> Nov 29 12:56:30 plantae slurmctld[29613]: Registering slurmctld at port
> 6817 with slurmdbd.
> Nov 29 12:56:32 plantae slurmctld[29613]: error: If munged is up, restart
> with --num-threads=10
> Nov 29 12:56:32 plantae slurmctld[29613]: error: Munge encode failed:
> Failed to access "magic": No such file or directory
> Nov 29 12:56:32 plantae slurmctld[29613]: error: authentication: Socket
> communication error
> Nov 29 12:56:32 plantae slurmctld[29613]: error: slurm_persist_conn_open:
> failed to send persistent connection init message to localhost:6819
> Nov 29 12:56:32 plantae slurmctld[29613]: error: slurmdbd: Sending
> PersistInit msg: Protocol authentication error
> Nov 29 12:56:34 plantae slurmctld[29613]: error: If munged is up, restart
> with --num-threads=10
> Nov 29 12:56:34 plantae slurmctld[29613]: error: Munge encode failed:
> Failed to access "magic": No such file or directory
> Nov 29 12:56:34 plantae slurmctld[29613]: error: authentication: Socket
> communication error
> Nov 29 12:56:34 plantae slurmctld[29613]: error: slurm_persist_conn_open:
> failed to send persistent connection init message to localhost:6819
> Nov 29 12:56:34 plantae slurmctld[29613]: error: slurmdbd: Sending
> PersistInit msg: Protocol authentication error
> Nov 29 12:56:36 plantae slurmctld[29613]: error: If munged is up, restart
> with --num-threads=10
> Nov 29 12:56:36 plantae slurmctld[29613]: error: Munge encode failed:
> Failed to access "magic": No such file or directory
> Nov 29 12:56:36 plantae slurmctld[29613]: error: authentication: Socket
> communication error
> Nov 29 12:56:36 plantae slurmctld[29613]: error: slurm_persist_conn_open:
> failed to send persistent connection init message to localhost:6819
> Nov 29 12:56:36 plantae slurmctld[29613]: error: slurmdbd: Sending
> PersistInit msg: Protocol authentication error
> Nov 29 12:56:36 plantae slurmctld[29613]: fatal: It appears you don't have
> any association data from your database.  The priority/multifactor plugin
> requires this information to run correctly.  Please check your database
> connection and try again.
> Nov 29 12:56:36 plantae systemd[1]: slurmctld.service: Main process
> exited, code=exited, status=1/FAILURE
> Nov 29 12:56:36 plantae systemd[1]: slurmctld.service: Unit entered failed
> state.
> Nov 29 12:56:36 plantae systemd[1]: slurmctld.service: Failed with result
> 'exit-code'.
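
In case it helps with narrowing this down, here is a minimal Python sketch
(my own illustration, not part of Slurm or munge) that pulls out of a log
excerpt the value munge was trying to open. The function name
munge_access_targets and the sample text are hypothetical; the sketch only
assumes the "Munge encode failed: Failed to access ..." wording shown in the
log above, and that lines may be wrapped and prefixed with "> " as in this
mail.

```python
import re
from collections import Counter


def munge_access_targets(log_text):
    """Count the quoted targets in 'Munge encode failed' log messages.

    Hypothetical helper: it only understands the message wording seen in
    the log excerpt above.
    """
    # Drop mail-style "> " quoting, then rejoin wrapped lines with spaces
    # so a message split across two lines can be matched as one string.
    unquoted = [re.sub(r"^>\s?", "", line).strip()
                for line in log_text.splitlines()]
    joined = " ".join(unquoted)
    pattern = r'Munge encode failed:\s+Failed to access "([^"]+)"'
    return Counter(re.findall(pattern, joined))


sample = """\
> Nov 29 12:56:32 plantae slurmctld[29613]: error: Munge encode failed:
> Failed to access "magic": No such file or directory
> Nov 29 12:56:34 plantae slurmctld[29613]: error: Munge encode failed:
> Failed to access "magic": No such file or directory
"""

print(munge_access_targets(sample))  # Counter({'magic': 2})
```

If the value reported there is not a filesystem path, that would support
the suspicion that something other than a socket path is being handed to
munge.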