<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>The two most likely causes of munge complaints:</p>
<p>1. The key in /etc/munge/munge.key differs between nodes<br>
2. Clocks out of sync on the nodes in question<br>
</p>
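<p>A rough sketch of how both could be checked, not a definitive procedure. Note that running <code>munge -n | unmunge</code> on a single node only shows that the local munged works; it does not compare keys between nodes. The helper names <code>keys_match</code> and <code>clock_ok</code> are hypothetical, passwordless ssh and sudo on the radonc nodes are assumed, and the 300-second limit is my assumption about munge's default credential lifetime.</p>
<pre>
```shell
#!/bin/sh
# Hypothetical helpers; nothing here runs ssh or touches munge by itself.

# keys_match: read checksum lines ("HASH  FILE") on stdin and succeed
# only if every line carries the same hash in its first field.
keys_match() {
    awk '{ seen[$1] = 1 }
         END { n = 0; for (h in seen) n++; exit (n == 1 ? 0 : 1) }'
}

# clock_ok: succeed if two epoch timestamps are within MAX seconds.
# MAX defaults to 300, an assumed value for munge's default TTL.
clock_ok() {
    skew=$(( $1 - $2 ))
    [ "${skew#-}" -le "${3:-300}" ]
}

# Possible usage from the head node (assumes passwordless ssh and sudo):
#   for n in radonc01 radonc02 radonc03 radonc04; do
#       ssh "$n" sudo sha256sum /etc/munge/munge.key
#   done | keys_match || echo "munge keys differ"
#
#   clock_ok "$(date +%s)" "$(ssh radonc01 date +%s)" || echo "clock skew"
#
#   munge -n | ssh radonc01 unmunge    # cross-node credential test
```
</pre>
<p>The head node's own key and clock belong in the comparison as well. If the keys turn out to differ, copying one key to every node and restarting munged and slurmd there is the usual remedy.</p>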
<p>Andy<br>
</p>
<br>
<div class="moz-cite-prefix">On 05/07/2018 03:50 PM, Eric F. Alemany
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:BA7B27BD-134B-4BF4-83DB-47395A399C0D@stanford.edu">
Greetings,
<div class=""><br class="">
</div>
<div class="">Reminder: I am new to SLURM.</div>
<div class=""><br class="">
</div>
<div class="">When I run “sinfo”, my nodes show as down.</div>
<div class=""><br class="">
</div>
<div class="">
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; background-color: rgb(255, 255, 255);"
class="">
<span style="font-variant-ligatures: no-common-ligatures"
class="">sinfo</span></div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; background-color: rgb(255, 255, 255);"
class="">
<span style="font-variant-ligatures: no-common-ligatures"
class="">PARTITION AVAIL TIMELIMIT NODES STATE NODELIST</span></div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; background-color: rgb(255, 255, 255);"
class="">
<span style="font-variant-ligatures: no-common-ligatures"
class="">debug* up infinite 4 down*
radonc[01-04]</span></div>
</div>
<div class=""><br class="">
</div>
<div class="">This is what I have tried so far, and nothing has
helped. The nodes stay in the “idle” state for 2-3 minutes and then
go “down” again.</div>
<div class=""><br class="">
</div>
<div class="">
<div style="margin: 0px; line-height: normal;" class=""><span
style="font-kerning: none" class="">systemctl restart
slurmd on all nodes</span></div>
<div style="margin: 0px; line-height: normal; min-height: 14px;"
class=""><span style="font-kerning: none" class=""></span><br
class="">
</div>
<div style="margin: 0px; line-height: normal;" class=""><span
style="font-kerning: none" class="">systemctl restart
slurmctld on master</span></div>
<div style="margin: 0px; line-height: normal; min-height: 14px;"
class=""><span style="font-kerning: none" class=""></span><br
class="">
</div>
<div style="margin: 0px; line-height: normal;" class=""><span
style="font-kerning: none" class="">scontrol update
node=radonc[01-04] state=UNDRAIN</span></div>
</div>
<div style="margin: 0px; line-height: normal;" class=""><span
style="font-kerning: none" class=""><br class="">
</span></div>
<div style="margin: 0px; line-height: normal;" class=""><span
style="font-kerning: none" class="">scontrol update
node=radonc[01-04] state=IDLE</span></div>
<div style="margin: 0px; line-height: normal;" class=""><span
style="font-kerning: none" class=""><br class="">
</span></div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">I looked at the log file in <span style="color:
rgb(34, 34, 34); background-color: rgb(247, 247, 247);"
class="">/var/log/</span><span class="skimlinks-unlinked"
style="color: rgb(34, 34, 34); border: 0px; margin: 0px;
padding: 0px; vertical-align: baseline;">SlurmdLogFile.log
and saw several “Munge decode failed: Invalid credential” errors:</span></div>
<div class=""><span class="skimlinks-unlinked" style="color:
rgb(34, 34, 34); border: 0px; margin: 0px; padding: 0px;
vertical-align: baseline;"><br class="">
</span></div>
<div class=""><span class="skimlinks-unlinked" style="color:
rgb(34, 34, 34); border: 0px; margin: 0px; padding: 0px;
vertical-align: baseline;">
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; color: rgb(0, 0, 0); background-color:
rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures"
class="">[2018-05-07T12:37:20.028] error:
slurm_unpack_received_msg:
MESSAGE_NODE_REGISTRATION_STATUS has authentication error:
Invalid credential </span></div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; color: rgb(0, 0, 0); background-color:
rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures"
class="">[2018-05-07T12:37:20.028] error:
slurm_unpack_received_msg: Protocol authentication error</span></div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; color: rgb(0, 0, 0); background-color:
rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures"
class="">[2018-05-07T12:37:20.028] error: Munge decode
failed: Invalid credential</span></div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; color: rgb(0, 0, 0); background-color:
rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures"
class="">[2018-05-07T12:37:20.028] error:
slurm_unpack_received_msg:
MESSAGE_NODE_REGISTRATION_STATUS has authentication error:
Invalid credential </span></div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; color: rgb(0, 0, 0); background-color:
rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures"
class="">[2018-05-07T12:37:20.028] error:
slurm_unpack_received_msg: Protocol authentication error</span></div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; color: rgb(0, 0, 0); background-color:
rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures"
class="">[2018-05-07T12:37:20.038] error:
slurm_receive_msg [10.112.0.14:42140]: Unspecified error</span></div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; color: rgb(0, 0, 0); background-color:
rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures"
class="">[2018-05-07T12:37:20.038] error:
slurm_receive_msg [10.112.0.5:34752]: Unspecified error</span></div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; color: rgb(0, 0, 0); background-color:
rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures"
class="">[2018-05-07T12:37:20.038] error:
slurm_receive_msg [10.112.0.6:46746]: Unspecified error</span></div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; color: rgb(0, 0, 0); background-color:
rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures"
class="">[2018-05-07T12:37:20.039] error:
slurm_receive_msg [10.112.0.16:50788]: Unspecified error</span></div>
</span></div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">I ran the following command on all nodes (including
the master/head node) and got “Success” on each:</div>
<div class=""><br class="">
</div>
<div class="">
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; background-color: rgb(255, 255, 255);"
class="">
<span style="font-variant-ligatures: no-common-ligatures"
class=""> munge -n | unmunge | grep STATUS</span></div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; background-color: rgb(255, 255, 255);"
class="">
<span style="font-variant-ligatures: no-common-ligatures"
class=""><b class="">STATUS</b>: Success (0)</span></div>
</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">How can I fix this problem?</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">Thank you in advance for all your help.</div>
<div class=""><br class="">
</div>
<div class="">Eric</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">
<div style="color: rgb(0, 0, 0); letter-spacing: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;
word-wrap: break-word; -webkit-nbsp-mode: space;
-webkit-line-break: after-white-space;" class="">
<div style="color: rgb(0, 0, 0); letter-spacing: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;
word-wrap: break-word; -webkit-nbsp-mode: space;
-webkit-line-break: after-white-space;" class="">
<div style="text-align: -webkit-auto; orphans: 2; widows:
2; word-wrap: break-word; -webkit-nbsp-mode: space;
-webkit-line-break: after-white-space;" class="">
<div style="orphans: auto; widows: auto;" class=""><span
style="text-align: -webkit-auto; background-color:
rgba(255, 255, 255, 0);" class="">_____________________________________________________________________________________________________</span></div>
<div style="orphans: auto; widows: auto;" class=""><span
style="background-color: rgba(255, 255, 255, 0);"
class=""><br class="">
</span></div>
<span style="background-color: rgba(255, 255, 255, 0);"
class=""><b class="">
<div style="orphans: auto; widows: auto;" class=""><b
style="text-align: -webkit-auto;" class="">Eric
F. Alemany</b></div>
</b>
<div style="orphans: auto; widows: auto;" class=""><i
style="text-align: -webkit-auto;" class="">System
Administrator for Research</i></div>
</span>
<div style="orphans: auto; widows: auto;" class=""><span
style="background-color: rgba(255, 255, 255, 0);"
class=""><br class="">
</span></div>
<div style="orphans: auto; widows: auto;" class=""><span
style="text-align: -webkit-auto; background-color:
rgba(255, 255, 255, 0);" class="">Division of
Radiation &amp; Cancer Biology</span></div>
<div style="orphans: auto; widows: auto;" class=""><span
style="text-align: -webkit-auto; background-color:
rgba(255, 255, 255, 0);" class="">Department of
Radiation Oncology</span></div>
<div style="orphans: auto; widows: auto;" class=""><span
style="background-color: rgba(255, 255, 255, 0);"
class=""><br class="">
</span></div>
<div style="orphans: auto; widows: auto;" class=""><span
style="text-align: -webkit-auto; background-color:
rgba(255, 255, 255, 0);" class="">Stanford
University School of Medicine</span></div>
<div style="orphans: auto; widows: auto;" class=""><span
style="text-align: -webkit-auto; background-color:
rgba(255, 255, 255, 0);" class="">Stanford,
California 94305</span></div>
<div style="orphans: auto; widows: auto;" class=""><span
style="background-color: rgba(255, 255, 255, 0);"
class=""><br class="">
</span></div>
<div style="orphans: auto; widows: auto;" class=""><span
style="background-color: rgba(255, 255, 255, 0);"
class=""><font style="text-align: -webkit-auto;"
class="">Tel:</font><a href="tel:1-650-498-7969"
x-apple-data-detectors="true"
x-apple-data-detectors-type="telephone"
x-apple-data-detectors-result="1"
style="text-align: -webkit-auto;" class=""
moz-do-not-send="true">1-650-498-7969</a><font
style="text-align: -webkit-auto;" class=""> No
Texting</font></span></div>
<div style="orphans: auto; widows: auto;" class=""><span
style="background-color: rgba(255, 255, 255, 0);"
class=""><font style="text-align: -webkit-auto;"
class="">Fax:</font><a href="tel:1-650-723-7382"
x-apple-data-detectors="true"
x-apple-data-detectors-type="telephone"
x-apple-data-detectors-result="2"
style="text-align: -webkit-auto;" class=""
moz-do-not-send="true">1-650-723-7382</a></span></div>
<div style="orphans: auto; widows: auto;" class=""><br
class="">
</div>
</div>
<div style="word-wrap: break-word; -webkit-nbsp-mode:
space; -webkit-line-break: after-white-space;" class="">
</div>
</div>
</div>
<br class="Apple-interchange-newline">
</div>
<br class="">
</div>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Andy Riebs
<a class="moz-txt-link-abbreviated" href="mailto:andy.riebs@hpe.com">andy.riebs@hpe.com</a>
Hewlett-Packard Enterprise
High Performance Computing Software Engineering
+1 404 648 9024
My opinions are not necessarily those of HPE
May the source be with you!
</pre>
</body>
</html>