<div dir="ltr">I actually just managed to figure that one out. <div><br></div><div>The problem was that I had set AccountingStoragePass=magic in slurm.conf. After re-reading the documentation, it seems this option is only needed when a separate munge instance controls the logins to the database, which is not my case. </div><div>Commenting that line out fixed the munge error, but I am now getting a different one: </div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Nov 29 13:19:20 plantae slurmctld[29984]: Registering slurmctld at port 6817 with slurmdbd.<br>Nov 29 13:19:20 plantae slurmctld[29984]: error: slurm_persist_conn_open: Something happened with the receiving/processing of the persistent connection init message to localhost:6819: Initial RPC not DBD_INIT<br>Nov 29 13:19:20 plantae systemd[1]: slurmctld.service: Main process exited, code=exited, status=1/FAILURE<br>Nov 29 13:19:20 plantae systemd[1]: slurmctld.service: Unit entered failed state.<br>Nov 29 13:19:20 plantae systemd[1]: slurmctld.service: Failed with result 'exit-code'.</blockquote><div><br></div><div>My slurm.conf looks like this:</div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"># LOGGING AND ACCOUNTING<br>AccountingStorageHost=localhost<br>AccountingStorageLoc=slurm_db<br>#AccountingStoragePass=magic<br>#AccountingStoragePort=<br>AccountingStorageType=accounting_storage/slurmdbd<br>AccountingStorageUser=slurm<br>AccountingStoreJobComment=YES<br>ClusterName=research<br>JobCompType=jobcomp/none<br>JobAcctGatherFrequency=30<br>JobAcctGatherType=jobacct_gather/none<br>SlurmctldDebug=3<br>SlurmdDebug=3</blockquote><div><br></div><div>And the slurmdbd.conf looks like this:</div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid 
rgb(204,204,204);padding-left:1ex">ArchiveEvents=yes<br>ArchiveJobs=yes<br>ArchiveResvs=yes<br>ArchiveSteps=no<br>#ArchiveTXN=no<br>#ArchiveUsage=no<br># Authentication info<br>AuthType=auth/munge<br>AuthInfo=/var/run/munge/munge.socket.2</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">#Database info<br># slurmDBD info<br>DbdAddr=plantae<br>DbdHost=plantae<br># Database info<br>StorageType=accounting_storage/mysql<br>StorageHost=localhost<br>SlurmUser=slurm<br>StoragePass=magic<br>StorageUser=slurm<br>StorageLoc=slurm_db</blockquote></div><div><br></div><div><br></div></div><div>Thank you very much in advance. </div><div><br></div><div>Best,</div><div>Bruno </div><div><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 29 November 2017 at 13:28, Andy Riebs <span dir="ltr"><<a href="mailto:andy.riebs@hpe.com" target="_blank">andy.riebs@hpe.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" text="#000000">
    It looks like you don't have the munged daemon running.<div><div class="h5"><br>
    <br>
    <div class="m_4940209100258878838moz-cite-prefix">On 11/29/2017 08:01 AM, Bruno Santos
      wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">Hi everyone,
        <div><br>
        </div>
        <div>I have set up Slurm to use slurm_db and all was working
          fine. However, I had to change slurm.conf to play with user
          priority, and upon restarting, slurmctld fails with the
          messages below. It seems that it is somehow trying to
          use the MySQL password as a munge socket? </div>
        <div>Any idea how to solve it? </div>
        <div> </div>
        <div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Nov 29 12:56:30 plantae
            slurmctld[29613]: Registering slurmctld at port 6817 with
            slurmdbd.<br>
            Nov 29 12:56:32 plantae slurmctld[29613]: error: If munged
            is up, restart with --num-threads=10<br>
            Nov 29 12:56:32 plantae slurmctld[29613]: error: Munge
            encode failed: Failed to access "magic": No such file or
            directory<br>
            Nov 29 12:56:32 plantae slurmctld[29613]: error:
            authentication: Socket communication error<br>
            Nov 29 12:56:32 plantae slurmctld[29613]: error:
            slurm_persist_conn_open: failed to send persistent
            connection init message to localhost:6819<br>
            Nov 29 12:56:32 plantae slurmctld[29613]: error: slurmdbd:
            Sending PersistInit msg: Protocol authentication error<br>
            Nov 29 12:56:34 plantae slurmctld[29613]: error: If munged
            is up, restart with --num-threads=10<br>
            Nov 29 12:56:34 plantae slurmctld[29613]: error: Munge
            encode failed: Failed to access "magic": No such file or
            directory<br>
            Nov 29 12:56:34 plantae slurmctld[29613]: error:
            authentication: Socket communication error<br>
            Nov 29 12:56:34 plantae slurmctld[29613]: error:
            slurm_persist_conn_open: failed to send persistent
            connection init message to localhost:6819<br>
            Nov 29 12:56:34 plantae slurmctld[29613]: error: slurmdbd:
            Sending PersistInit msg: Protocol authentication error<br>
            Nov 29 12:56:36 plantae slurmctld[29613]: error: If munged
            is up, restart with --num-threads=10<br>
            Nov 29 12:56:36 plantae slurmctld[29613]: error: Munge
            encode failed: Failed to access "magic": No such file or
            directory<br>
            Nov 29 12:56:36 plantae slurmctld[29613]: error:
            authentication: Socket communication error<br>
            Nov 29 12:56:36 plantae slurmctld[29613]: error:
            slurm_persist_conn_open: failed to send persistent
            connection init message to localhost:6819<br>
            Nov 29 12:56:36 plantae slurmctld[29613]: error: slurmdbd:
            Sending PersistInit msg: Protocol authentication error<br>
            Nov 29 12:56:36 plantae slurmctld[29613]: fatal: It appears
            you don't have any association data from your database.  The
            priority/multifactor plugin requires this information to run
            correctly.  Please check your database connection and try
            again.<br>
            Nov 29 12:56:36 plantae systemd[1]: slurmctld.service: Main
            process exited, code=exited, status=1/FAILURE<br>
            Nov 29 12:56:36 plantae systemd[1]: slurmctld.service: Unit
            entered failed state.<br>
            Nov 29 12:56:36 plantae systemd[1]: slurmctld.service:
            Failed with result 'exit-code'.</blockquote>
          <div><br>
          </div>
          <div> </div>
        </div>
      </div>
    </blockquote>
    <br>
  </div></div></div>

</blockquote></div><br></div>