[slurm-users] Problem with slurmctld communication with slurmdbd

Andy Riebs andy.riebs at hpe.com
Wed Nov 29 06:28:44 MST 2017


It looks like you don't have the munged daemon running.
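
One quick way to confirm that is to check whether anything is listening on the MUNGE socket. Below is a minimal sketch, not an official tool: the function name check_munge_socket is made up for illustration, and the default path /var/run/munge/munge.socket.2 is an assumption that may differ on your installation.

```python
import os
import socket
import stat

# Assumed default MUNGE socket location; adjust for your installation.
DEFAULT_MUNGE_SOCKET = "/var/run/munge/munge.socket.2"


def check_munge_socket(path=DEFAULT_MUNGE_SOCKET):
    """Return a short status string for the Unix socket at `path`.

    Hypothetical helper: it only tests that something accepts
    connections on the socket, not that credentials can be encoded.
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"          # no such file: daemon down or wrong path
    except OSError:
        return "inaccessible"     # e.g. permission problem on the directory
    if not stat.S_ISSOCK(mode):
        return "not-a-socket"     # the path exists but is a regular file/dir
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(2.0)
    try:
        client.connect(path)
    except OSError:
        return "not-listening"    # stale socket file, nothing behind it
    finally:
        client.close()
    return "listening"


if __name__ == "__main__":
    print(check_munge_socket())
```

If this reports "missing" or "not-listening" for the path your daemons are configured to use, munged is most likely not running, or the clients are pointed at the wrong socket. Note that the log complains about a path literally named "magic", so it may also be worth checking which socket path slurmctld is being told to use.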

On 11/29/2017 08:01 AM, Bruno Santos wrote:
> Hi everyone,
>
> I have set up Slurm to use slurm_db and all was working fine. However, 
> I had to change slurm.conf to play with user priority, and upon 
> restarting, slurmctld fails with the messages below. It 
> seems that it is somehow trying to use the MySQL password as a munge socket?
> Any idea how to solve it?
>
>     Nov 29 12:56:30 plantae slurmctld[29613]: Registering slurmctld at
>     port 6817 with slurmdbd.
>     Nov 29 12:56:32 plantae slurmctld[29613]: error: If munged is up,
>     restart with --num-threads=10
>     Nov 29 12:56:32 plantae slurmctld[29613]: error: Munge encode
>     failed: Failed to access "magic": No such file or directory
>     Nov 29 12:56:32 plantae slurmctld[29613]: error: authentication:
>     Socket communication error
>     Nov 29 12:56:32 plantae slurmctld[29613]: error:
>     slurm_persist_conn_open: failed to send persistent connection init
>     message to localhost:6819
>     Nov 29 12:56:32 plantae slurmctld[29613]: error: slurmdbd: Sending
>     PersistInit msg: Protocol authentication error
>     Nov 29 12:56:34 plantae slurmctld[29613]: error: If munged is up,
>     restart with --num-threads=10
>     Nov 29 12:56:34 plantae slurmctld[29613]: error: Munge encode
>     failed: Failed to access "magic": No such file or directory
>     Nov 29 12:56:34 plantae slurmctld[29613]: error: authentication:
>     Socket communication error
>     Nov 29 12:56:34 plantae slurmctld[29613]: error:
>     slurm_persist_conn_open: failed to send persistent connection init
>     message to localhost:6819
>     Nov 29 12:56:34 plantae slurmctld[29613]: error: slurmdbd: Sending
>     PersistInit msg: Protocol authentication error
>     Nov 29 12:56:36 plantae slurmctld[29613]: error: If munged is up,
>     restart with --num-threads=10
>     Nov 29 12:56:36 plantae slurmctld[29613]: error: Munge encode
>     failed: Failed to access "magic": No such file or directory
>     Nov 29 12:56:36 plantae slurmctld[29613]: error: authentication:
>     Socket communication error
>     Nov 29 12:56:36 plantae slurmctld[29613]: error:
>     slurm_persist_conn_open: failed to send persistent connection init
>     message to localhost:6819
>     Nov 29 12:56:36 plantae slurmctld[29613]: error: slurmdbd: Sending
>     PersistInit msg: Protocol authentication error
>     Nov 29 12:56:36 plantae slurmctld[29613]: fatal: It appears you
>     don't have any association data from your database.  The
>     priority/multifactor plugin requires this information to run
>     correctly.  Please check your database connection and try again.
>     Nov 29 12:56:36 plantae systemd[1]: slurmctld.service: Main
>     process exited, code=exited, status=1/FAILURE
>     Nov 29 12:56:36 plantae systemd[1]: slurmctld.service: Unit
>     entered failed state.
>     Nov 29 12:56:36 plantae systemd[1]: slurmctld.service: Failed with
>     result 'exit-code'.
>
>
