<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Geneva;
panose-1:2 11 5 3 3 4 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:#954F72;
text-decoration:underline;}
span.E-MailFormatvorlage17
{mso-style-type:personal-compose;
font-family:"Calibri",sans-serif;}
span.apple-converted-space
{mso-style-name:apple-converted-space;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:70.85pt 70.85pt 2.0cm 70.85pt;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="DE-CH" link="#0563C1" vlink="#954F72">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">Hi all,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">According to the Slurm documentation, the SIGCONT and SIGTERM signals are sent twice to a job that has been selected for preemption:</span><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:#46545C;background:white;mso-fareast-language:DE">“Once a job has been selected for preemption, its end time is set to the current time plus </span></i><i><span lang="EN-US" style="font-size:11.0pt;color:#46545C;border:none windowtext 1.0pt;padding:0cm;mso-fareast-language:DE">GraceTime</span></i><i><span lang="EN-US" style="font-size:11.0pt;color:#46545C;background:white;mso-fareast-language:DE">.
The job is immediately sent SIGCONT and SIGTERM signals in order to provide notification of its imminent termination. This is followed by the SIGCONT, SIGTERM and SIGKILL signal sequence upon reaching its new end time.”</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">I can trap the first SIGTERM in a job submitted with <i>srun</i>, and also in a job step launched with <i>srun</i> from inside a batch script submitted with <i>sbatch</i>. In the batch script itself, however, I cannot trap it: the batch shell only receives a SIGTERM after GraceTime has expired. Why is the first SIGTERM not sent to the batch shell? I use the following test job:</span><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">#!/bin/bash</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">housekeeping() {</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> echo "$(date): Cleaning up..." >> job.log</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> sleep 10</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> echo "$(date): Done." >> job.log</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> exit 1</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">}</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">trap 'housekeeping' TERM</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">echo "$(date): Starting batch job." >> job.log</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">while true; do</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> sleep 2 &</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> wait $!</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">done</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">exit 0</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:14.0pt;color:black;mso-fareast-language:DE">Example: Submitting the test job with <i>sbatch</i>:</span><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">SubmitTime=2018-10-18T15:01:52 EligibleTime=2018-10-18T15:01:52</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">StartTime=2018-10-18T15:01:54 EndTime=2018-10-18T<b><span style="background:yellow">15:03:13</span> </b>Deadline=N/A</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">PreemptTime=2018-10-18T<b><span style="background:yellow">15:02:13</span></b> SuspendTime=None SecsPreSuspend=0</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">job.log:</span><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">Thu Oct 18 15:01:54 CEST 2018: Starting batch job.</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">Thu Oct 18 <span style="background:yellow">15:03:24</span> CEST 2018: Cleaning up...</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">Thu Oct 18 15:03:34 CEST 2018: Done.</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:14.0pt;color:black;mso-fareast-language:DE">Example: Submitting the test job with <i>srun</i>:</span><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">SubmitTime=2018-10-18T15:08:52 EligibleTime=2018-10-18T15:08:52</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">StartTime=2018-10-18T15:08:52 EndTime=2018-10-18T<b>15:09:50 </b>Deadline=N/A</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">PreemptTime=2018-10-18T<b><span style="background:lime">15:09:40</span> </b>SuspendTime=None SecsPreSuspend=0</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">job.log:</span><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">Thu Oct 18 15:08:52 CEST 2018: Starting batch job.</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">Thu Oct 18 <span style="background:lime">15:09:40</span> CEST 2018: Cleaning up...</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">Thu Oct 18 15:09:50 CEST 2018: Done.</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
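<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">The timestamps in the two examples can be compared with a few lines of shell arithmetic. This is only a sketch: the helper to_secs and the variable names are made up for illustration, and the values are copied from the job records and job.log excerpts above.</span></p>
<pre>
```shell
#!/bin/bash
# Hypothetical helper: convert HH:MM:SS to seconds since midnight.
# Stripping one leading zero keeps "08" and "09" from being read as octal.
to_secs() {
    h=${1%%:*}
    s=${1##*:}
    m=${1#*:}
    m=${m%:*}
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Values copied from the sbatch example.
sbatch_preempt=15:02:13
sbatch_end=15:03:13
sbatch_trap=15:03:24

# Values copied from the srun example.
srun_preempt=15:09:40
srun_trap=15:09:40

grace=$(( $(to_secs "$sbatch_end") - $(to_secs "$sbatch_preempt") ))
sbatch_delay=$(( $(to_secs "$sbatch_trap") - $(to_secs "$sbatch_preempt") ))
srun_delay=$(( $(to_secs "$srun_trap") - $(to_secs "$srun_preempt") ))

echo "sbatch: EndTime - PreemptTime = ${grace}s, trap ran ${sbatch_delay}s after PreemptTime"
echo "srun:   trap ran ${srun_delay}s after PreemptTime"
```
</pre>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">With these values, EndTime minus PreemptTime equals the configured GraceTime of 60 seconds, the trap in the sbatch job ran 71 seconds after PreemptTime (11 seconds after EndTime), and the trap in the srun job ran in the same second as PreemptTime.</span></p>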
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="color:black;mso-fareast-language:DE">Slurm version 17.02.10<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">slurm.conf:</span><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">(…)</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">PreemptType=preempt/qos</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">PreemptMode=CANCEL</span></i></b><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">PartitionName=low-prio Nodes=node[01-09] DefaultTime=01:00:00 MaxTime=24:00:00 DefMemPerCPU=2020 <b>GraceTime=60</b> State=UP QOS=part_gpu</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">(…)</span></i><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span><span lang="EN-US" style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">Is this the intended behavior, or am I missing something? It seems that the only ways to perform housekeeping from inside a batch script are the --signal option (e.g. --signal=B:TERM@60) or the extra time provided by KillWait. Can anybody confirm?</span><span style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
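<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">A minimal sketch of the --signal variant of the test job might look as follows. Whether Slurm actually delivers this signal to the batch shell at preemption time is part of the question above, so treat it as an assumption. The terminated flag and the self-signal line are additions for illustration only: the latter stands in for Slurm so that the script can be tried outside a job, and should be removed in a real batch script.</span></p>
<pre>
```shell
#!/bin/bash
#SBATCH --partition=low-prio
# "B:" asks Slurm to signal only the batch shell; "@60" means about 60 s
# before the job's end time (Slurm may send it up to 60 s earlier).
#SBATCH --signal=B:TERM@60

terminated=0

housekeeping() {
    echo "$(date): Cleaning up..." >> job.log
    # Set a flag instead of exiting so the main loop can finish cleanly.
    terminated=1
}

trap 'housekeeping' TERM

echo "$(date): Starting batch job." >> job.log

# Illustration only: send SIGTERM to this shell after 1 s, standing in for Slurm.
( sleep 1; kill -TERM $$ ) &

# Keep the workload in the background and wait on it, so that the trap
# runs as soon as the signal arrives rather than after the sleep ends.
while [ "$terminated" -eq 0 ]; do
    sleep 2 &
    wait $!
done

echo "$(date): Done." >> job.log
```
</pre>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">Without the "B:" prefix the signal goes to the job steps rather than to the batch shell, which is why the prefix matters for housekeeping done in the batch script itself.</span></p>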
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE"> </span><span style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;color:black;mso-fareast-language:DE">Thank you!</span><span style="color:black;mso-fareast-language:DE"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:10.5pt;color:black">---</span><span lang="DE" style="font-size:9.0pt;font-family:"Geneva",sans-serif;color:black"><br>
Universität Bern<br>
Informatikdienste<br>
Gruppe Systemdienste<br>
<br>
Nico Färber<br>
Systemadministrator HPC<br>
<br>
Hochschulstrasse 6<br>
CH-3012 Bern<br>
Tel. +41 (0)31 631 51 89<br>
<br>
mailto: </span><span lang="DE" style="font-size:10.5pt;color:black"><a href="mailto:grid-support@id.unibe.ch" target="_blank"><span style="font-size:9.0pt;font-family:"Geneva",sans-serif;color:blue">grid-support@id.unibe.ch</span></a></span><span lang="DE" style="font-size:9.0pt;font-family:"Geneva",sans-serif;color:black"><br>
</span><span lang="DE" style="font-size:10.5pt;color:black"><a href="http://www.id.unibe.ch/" target="_blank" title="http://www.id.unibe.ch/"><span style="font-size:9.0pt;font-family:"Geneva",sans-serif;color:blue">http://www.id.unibe.ch/</span></a></span><span lang="DE" style="font-size:11.0pt"><o:p></o:p></span></p>
</div>
</body>
</html>