<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
Hi Herbert,</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
Just as Angelos described, our poweroff script also checks whether the node is really IDLE and sends the poweroff command only if it is.</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
Excerpt:</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<span style="font-family: &quot;Courier New&quot;, monospace;">hosts=$(scontrol show hostnames "$1")</span><br>
</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<span style="font-family: &quot;Courier New&quot;, monospace;">for host in $hosts; do</span><span style="font-family: &quot;Courier New&quot;, monospace;"><br>
</span></div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<div><span style="font-family: &quot;Courier New&quot;, monospace;"> if ! scontrol show node "$host" | tr ' ' '\n' | grep -q 'State=IDLE+POWER$'; then</span></div>
<div><span style="font-family: &quot;Courier New&quot;, monospace;"> echo "node $host NOT IDLE" >>"$OUTFILE"</span></div>
<div><span style="font-family: &quot;Courier New&quot;, monospace;"> continue</span></div>
<div><span style="font-family: &quot;Courier New&quot;, monospace;"> else</span></div>
<div><span style="font-family: &quot;Courier New&quot;, monospace;"> echo "node $host IDLE" >>"$OUTFILE"</span></div>
<div><span style="font-family: &quot;Courier New&quot;, monospace;"> fi</span></div>
<div><span style="font-family: &quot;Courier New&quot;, monospace;"> ssh "$host" poweroff</span><br>
</div>
<div><span style="font-family: &quot;Courier New&quot;, monospace;"> ...<br>
</span></div>
<div><span style="font-family: &quot;Courier New&quot;, monospace;"> sleep 1</span><br>
</div>
<div><span style="font-family: &quot;Courier New&quot;, monospace;"> ...<br>
</span></div>
<div><span style="font-family: &quot;Courier New&quot;, monospace;">done</span></div>
<br>
</div>
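<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
As a rough sketch (not taken from our script), the state check can be pulled into a small function that reads a node record on stdin, so it can be tried without a live cluster. The function name node_is_idle and the two sample records are made up for illustration; the only thing taken from the excerpt is the match on State=IDLE+POWER.</div>

```shell
#!/usr/bin/env bash
# Sketch only: node_is_idle and the sample records below are illustrative,
# they are not part of the original poweroff script.

# Succeeds only if the node record on stdin contains the exact token
# State=IDLE+POWER, so states such as DOWN, ALLOCATED or IDLE+POWER+DRAIN fail.
node_is_idle() {
    tr ' ' '\n' | grep -qxF 'State=IDLE+POWER'
}

# Canned records stand in for the output of "scontrol show node <host>".
idle_record='NodeName=node01 Arch=x86_64 State=IDLE+POWER ThreadsPerCore=1'
down_record='NodeName=node02 Arch=x86_64 State=DOWN+DRAIN ThreadsPerCore=1'

for record in "$idle_record" "$down_record"; do
    if printf '%s\n' "$record" | node_is_idle; then
        echo "${record%% *}: IDLE, would power off"
    else
        echo "${record%% *}: NOT IDLE, skipping"
    fi
done
```

<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
In a real suspend script the call would presumably look like scontrol show node "$host" | node_is_idle, with the ssh poweroff only in the success branch.</div>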
<div>
<div id="Signature">
<div style="">
<div style="">Best,</div>
<div style="">Florian</div>
</div>
</div>
</div>
<div id="appendonsend"></div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);">
<br>
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size: 11pt;" data-ogsc=""><b>From:</b> slurm-users &lt;slurm-users-bounces@lists.schedmd.com&gt; on behalf of Steininger, Herbert &lt;herbert_steininger@psych.mpg.de&gt;<br>
<b>Sent:</b> Monday, 24 August 2020 10:52<br>
<b>To:</b> Slurm User Community List &lt;slurm-users@lists.schedmd.com&gt;<br>
<b>Subject:</b> [External] [slurm-users] [slurm 20.02.3] don't suspend nodes in down state</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt">
<div class="PlainText">Hi,<br>
<br>
How can I prevent Slurm from suspending nodes that I have set to the DOWN state for maintenance?<br>
I know about "SuspendExcNodes", but rolling out slurm.conf every time the set of nodes changes doesn't seem like the right way.<br>
Is there a state that I can set so that the nodes don't get suspended?<br>
<br>
A few times, while I was working on a server, Slurm suspended the node once our idle time (1 h) had elapsed.<br>
<br>
TIA,<br>
Herbert<br>
<br>
-- <br>
Herbert Steininger<br>
Head of IT &amp; HPC<br>
Administrator<br>
Max-Planck-Institut für Psychiatrie<br>
Kraepelinstr. 2-10<br>
80804 München <br>
Tel +49 (0)89 / 30622-368<br>
Mail herbert_steininger@psych.mpg.de<br>
Web <a href="https://www.psych.mpg.de">https://www.psych.mpg.de</a><br>
<br>
<br>
<br>
</div>
</span></font></div>
</body>
</html>