<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Hi Kevin,</p>
I have no experience with Slurm version 20, but this looks like a
misconfiguration.<br>
Did you change any settings in your slurm.conf file after the
upgrade?<br>
Check the documentation to see whether any of the slurm.conf
directives changed between the two versions.<br>
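As a rough sketch of that check, assuming you still have a copy of
the pre-upgrade slurm.conf, something like the Python below lists the
directives that differ between two files. The function names and the
two file paths given on the command line are hypothetical, and the
parser only handles simple Key=Value lines (it does not follow
Include directives).<br>

```python
import sys


def parse_conf(text):
    """Return {lowercased_key: value} for simple Key=Value lines."""
    settings = {}
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip().lower()
        value = value.strip()
        # NodeName/PartitionName lines repeat, so keep one entry per name.
        if key in ("nodename", "partitionname"):
            parts = value.split()
            key = key + "=" + (parts[0] if parts else "")
        settings[key] = value
    return settings


def diff_conf(old_text, new_text):
    """Return sorted (key, old_value, new_value) tuples for changed keys."""
    old, new = parse_conf(old_text), parse_conf(new_text)
    changes = []
    for key in sorted(set(old) | set(new)):
        if old.get(key) != new.get(key):
            changes.append((key, old.get(key), new.get(key)))
    return changes


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (paths are placeholders): python conf_diff.py OLD_CONF NEW_CONF
    with open(sys.argv[1]) as f_old, open(sys.argv[2]) as f_new:
        for key, before, after in diff_conf(f_old.read(), f_new.read()):
            print(key, "|", before, "|", after)
```

A value of None in the output means the directive is present in only
one of the two files.<br>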
<div class="moz-cite-prefix">On 05/11/2020 02:00, Kevin Buckley
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:6797fa17-2560-dde2-2863-bc92d2720db0@pawsey.org.au">
<br>
We have had a couple of nodes enter a DRAINED state where scontrol
<br>
gives the reason as
<br>
<br>
Reason=slurm.conf
<br>
<br>
In looking at the SlurmCtlD log we see pairs of lines as follows
<br>
<br>
update_node: node nid00245 reason set to: slurm.conf
<br>
update_node: node nid00245 state set to DRAINED
<br>
<br>
A search of the interweb thing for "Reason=slurm.conf" and
<br>
"reason set to: slurm.conf" has so far proved too much for
<br>
my search-fu.
<br>
<br>
The slurm.conf files do match across the Cray nodes so it
<br>
seems unlikely that it would be a "something out of sync"
<br>
thing.
<br>
<br>
Other "update_node" actions seen in the logs, e.g.,
<br>
<br>
"reason set to: NHC-Admindown"
<br>
<br>
also put the node into a DRAINED state but then, that's to
<br>
be expected.
<br>
<br>
We upgraded to 20.02.5 a couple of days ago and all of the
<br>
occurrences of the "reason set to: slurm.conf" message only
<br>
appear after the update.
<br>
<br>
<br>
Anyone else seen anything like this, especially any of you who
<br>
have just gone 20.02.5?
<br>
<br>
Yours, diving into the source soon,
<br>
Kevin
<br>
</blockquote>
<div class="moz-signature">-- <br>
<p>
<b>Cumprimentos / Best Regards,</b></p>
Zacarias Benta<br>
INCD @ LIP - Universidade do Minho<br>
<br>
<p>
<img src="https://www.incd.pt/img/incd-dark-logo.png" alt="INCD
Logo" width="181" height="93"> </p>
</div>
</body>
</html>