<div dir="ltr"><div>You could try this workaround:</div><div><br></div><div>scontrol update NodeName=&lt;node name&gt; State=IDLE</div><div><br></div><div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">Thanks &amp; Regards,<div>Sudeep Narayan Banerjee</div><div>System Analyst | Scientist B</div><div>Information System and Technology Facility</div><div>Indian Institute of Technology Gandhinagar</div><div>Palaj, Gujarat 382355, INDIA<br></div></div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Oct 28, 2020 at 5:41 PM Diego Zuccato &lt;<a href="mailto:diego.zuccato@unibo.it">diego.zuccato@unibo.it</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello all.<br>
<br>
I've found that some jobs occasionally leave their nodes in the DRAINING state.<br>
<br>
In slurmctld.log I find:<br>
-8&lt;--<br>
[2020-10-28T11:30:16.999] update_node: node str957-mtx-11 reason set to:<br>
Kill task failed<br>
[2020-10-28T11:30:16.999] update_node: node str957-mtx-11 state set to<br>
DRAINING<br>
-8&lt;--<br>
while on the node (slurmd.log):<br>
-8&lt;--<br>
[2020-10-28T11:24:11.980] [8975.0] task/cgroup:<br>
/slurm_str957-mtx-11/uid_2126297435/job_8975: alloc=117600MB<br>
mem.limit=117600MB memsw.limit=117600MB<br>
[2020-10-28T11:24:11.980] [8975.0] task/cgroup:<br>
/slurm_str957-mtx-11/uid_2126297435/job_8975/step_0: alloc=117600MB<br>
mem.limit=117600MB memsw.limit=117600MB<br>
[2020-10-28T11:29:18.926] [8975.0] Defering sending signal, processes in<br>
job are currently core dumping<br>
[2020-10-28T11:30:17.000] [8975.0] error: *** STEP 8975.0 STEPD<br>
TERMINATED ON str957-mtx-11 AT 2020-10-28T11:30:16 DUE TO JOB NOT ENDING<br>
WITH SIGNALS ***<br>
[2020-10-28T11:30:19.306] [8975.0] done with job<br>
-8&lt;--<br>
<br>
It seems slurmd takes a bit too long to terminate the job. Is there a<br>
timeout I could raise to avoid having to fix this manually?<br>
<br>
TIA.<br>
<br>
-- <br>
Diego Zuccato<br>
DIFA - Dip. di Fisica e Astronomia<br>
Servizi Informatici<br>
Alma Mater Studiorum - Università di Bologna<br>
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy<br>
tel.: +39 051 20 95786<br>
<br>
</blockquote></div>
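<div><br></div><div>A minimal sketch in Python of how the workaround at the top of this message could be scripted, assuming slurmctld.log entries look like the excerpt quoted in this thread. The function names nodes_with_failed_kill and workaround_commands are hypothetical helpers, not part of Slurm. The script only builds and prints the scontrol commands; it does not run them.</div>
<pre>
```python
import re
import shlex

# Matches slurmctld.log entries like the one quoted in this thread:
#   update_node: node str957-mtx-11 reason set to: Kill task failed
# \s+ also tolerates the line wrapping introduced by the mail client.
REASON_RE = re.compile(
    r"update_node:\s+node\s+(\S+)\s+reason\s+set\s+to:\s+Kill\s+task\s+failed"
)


def nodes_with_failed_kill(log_text):
    """Return unique node names, in first-seen order, flagged 'Kill task failed'."""
    seen = []
    for name in REASON_RE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def workaround_commands(log_text):
    """Build (but do not run) one scontrol command per affected node."""
    return [
        "scontrol update NodeName={} State=IDLE".format(shlex.quote(name))
        for name in nodes_with_failed_kill(log_text)
    ]


if __name__ == "__main__":
    # Sample text copied from the log excerpt in this thread.
    sample = (
        "[2020-10-28T11:30:16.999] update_node: node str957-mtx-11 reason set to:\n"
        "Kill task failed\n"
        "[2020-10-28T11:30:16.999] update_node: node str957-mtx-11 state set to\n"
        "DRAINING\n"
    )
    for cmd in workaround_commands(sample):
        print(cmd)
```
</pre>
<div>Printing the commands instead of executing them leaves room to review which nodes would be touched before applying the workaround by hand.</div>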