<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<font face="Helvetica, Arial, sans-serif">We installed Slurm 17.11.0
this afternoon on our 100+ node x86_64 cluster running CentOS 7.4,
and we periodically see a single node (perhaps the first node in an
allocation?) get drained with the message "batch job complete
failure".</font><br>
<br>
<font face="Helvetica, Arial, sans-serif">On one of the affected nodes,
slurmd.log reports:</font><br>
<div class="WordSection1">
<blockquote><font face="Helvetica, Arial, sans-serif"><span
style="font-size: 10pt; color: black;">pam_unix(slurm:session):
open_session - error recovering username</span><span
style="font-size: 10pt; color: black;"> <br>
pam_loginuid(slurm:session): unexpected response from failed
conversation function </span></font></blockquote>
</div>
<font face="Helvetica, Arial, sans-serif">On another node drained
for the same reason, the log shows:</font><br>
<blockquote><font face="Helvetica, Arial, sans-serif">error:
pam_open_session: Cannot make/remove an entry for the specified
session<br>
error: error in pam_setup<br>
error: job_manager exiting abnormally, rc = 4020<br>
sending REQUEST_COMPLETE_BATCH_SCRIPT, error:4020 status 0<br>
</font></blockquote>
<font face="Helvetica, Arial, sans-serif">slurmctld has logged</font><br>
<div class="WordSection1">
<blockquote><font face="Helvetica, Arial, sans-serif"><span
style="font-size: 10pt; color: black;"> error: slurmd error
running JobId=33 on node(s)=node048: Slurmd could not execve
job </span></font><br>
<br>
<font face="Helvetica, Arial, sans-serif"><span
style="font-size: 10pt; color: black;">drain_nodes: node
Summer0c048 state set to DRAIN</span></font></blockquote>
<font face="Helvetica, Arial, sans-serif"><span style="font-size:
10pt; color: black;">If anyone can shed some light on where I
should start looking, I shall be most obliged!</span></font><br>
<br>
<span style="font-size:10.0pt;font-family:'Segoe UI',sans-serif; color:black"><font face="Helvetica, Arial,
sans-serif">Andy</font><br>
</span><span style="font-size:10.0pt;font-family:'Segoe UI',sans-serif;color:black"> </span><br>
</div>
<pre class="moz-signature" cols="72">--
Andy Riebs
<a class="moz-txt-link-abbreviated" href="mailto:andy.riebs@hpe.com">andy.riebs@hpe.com</a>
Hewlett-Packard Enterprise
High Performance Computing Software Engineering
+1 404 648 9024
My opinions are not necessarily those of HPE
May the source be with you!
</pre>
</body>
</html>