[slurm-users] Problem with slurmctld communication with slurmdbd
Andy Riebs
andy.riebs at hpe.com
Wed Nov 29 06:28:44 MST 2017
It looks like you don't have the munged daemon running.
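
A minimal sketch for checking this from the log itself, assuming Python 3 is available on the controller. The helper names and the default socket path below are assumptions for illustration, not anything Slurm or MUNGE ships. It pulls the path named in the "Failed to access" errors and tests whether a Unix socket actually exists there.

```python
import os
import re
import stat

# Assumed default MUNGE socket location; your installation may differ.
DEFAULT_SOCKET = "/var/run/munge/munge.socket.2"


def failed_socket_paths(log_text):
    """Return the distinct paths named in 'Failed to access "..."' errors."""
    # Rejoin mail-wrapped log lines before matching.
    flat = " ".join(log_text.split())
    seen = []
    for path in re.findall(r'Failed to access "([^"]+)"', flat):
        if path not in seen:
            seen.append(path)
    return seen


def is_socket(path):
    """True if path exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        return False


if __name__ == "__main__":
    # Two wrapped lines copied from the report below.
    sample = (
        "Nov 29 12:56:32 plantae slurmctld[29613]: error: Munge encode\n"
        'failed: Failed to access "magic": No such file or directory\n'
    )
    for p in failed_socket_paths(sample) + [DEFAULT_SOCKET]:
        state = "socket present" if is_socket(p) else "no socket here"
        print(p + ": " + state)
```

If the default path reports no socket, munged is most likely not running. If the path in the error is not a socket path at all, it is worth checking which setting in slurm.conf supplied it.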
On 11/29/2017 08:01 AM, Bruno Santos wrote:
> Hi everyone,
>
> I have set up Slurm to use slurm_db and all was working fine. However,
> I had to change slurm.conf to play with user priority, and upon
> restarting, slurmctld fails with the messages below. It
> seems that it is somehow trying to use the MySQL password as a munge socket.
> Any idea how to solve it?
>
> Nov 29 12:56:30 plantae slurmctld[29613]: Registering slurmctld at
> port 6817 with slurmdbd.
> Nov 29 12:56:32 plantae slurmctld[29613]: error: If munged is up,
> restart with --num-threads=10
> Nov 29 12:56:32 plantae slurmctld[29613]: error: Munge encode
> failed: Failed to access "magic": No such file or directory
> Nov 29 12:56:32 plantae slurmctld[29613]: error: authentication:
> Socket communication error
> Nov 29 12:56:32 plantae slurmctld[29613]: error:
> slurm_persist_conn_open: failed to send persistent connection init
> message to localhost:6819
> Nov 29 12:56:32 plantae slurmctld[29613]: error: slurmdbd: Sending
> PersistInit msg: Protocol authentication error
> Nov 29 12:56:34 plantae slurmctld[29613]: error: If munged is up,
> restart with --num-threads=10
> Nov 29 12:56:34 plantae slurmctld[29613]: error: Munge encode
> failed: Failed to access "magic": No such file or directory
> Nov 29 12:56:34 plantae slurmctld[29613]: error: authentication:
> Socket communication error
> Nov 29 12:56:34 plantae slurmctld[29613]: error:
> slurm_persist_conn_open: failed to send persistent connection init
> message to localhost:6819
> Nov 29 12:56:34 plantae slurmctld[29613]: error: slurmdbd: Sending
> PersistInit msg: Protocol authentication error
> Nov 29 12:56:36 plantae slurmctld[29613]: error: If munged is up,
> restart with --num-threads=10
> Nov 29 12:56:36 plantae slurmctld[29613]: error: Munge encode
> failed: Failed to access "magic": No such file or directory
> Nov 29 12:56:36 plantae slurmctld[29613]: error: authentication:
> Socket communication error
> Nov 29 12:56:36 plantae slurmctld[29613]: error:
> slurm_persist_conn_open: failed to send persistent connection init
> message to localhost:6819
> Nov 29 12:56:36 plantae slurmctld[29613]: error: slurmdbd: Sending
> PersistInit msg: Protocol authentication error
> Nov 29 12:56:36 plantae slurmctld[29613]: fatal: It appears you
> don't have any association data from your database. The
> priority/multifactor plugin requires this information to run
> correctly. Please check your database connection and try again.
> Nov 29 12:56:36 plantae systemd[1]: slurmctld.service: Main
> process exited, code=exited, status=1/FAILURE
> Nov 29 12:56:36 plantae systemd[1]: slurmctld.service: Unit
> entered failed state.
> Nov 29 12:56:36 plantae systemd[1]: slurmctld.service: Failed with
> result 'exit-code'.
>
>