<div dir="ltr">Hi All,<div><br></div><div>I've run into what I think is a bug in srun's exit status when a job step hits its time limit, though perhaps my expectation is off. I expect srun to exit with a non-zero status whenever the time limit is reached before all tasks complete.</div><div><br></div><div>This behaves as expected when every task is timed out (the limit is 1 minute and both tasks sleep for at least 2 minutes):</div><div><span style="color:rgb(0,0,0);font-family:monospace;font-size:medium;white-space:pre-wrap"><br></span></div><div><span style="color:rgb(0,0,0);font-family:monospace;font-size:medium;white-space:pre-wrap"> > srun --time 1 --ntasks=2 perl -e 'sleep 120 + 120 * $ENV{SLURM_PROCID}'; echo "status: $?"</span><br></div><div><pre class="gmail-bz_comment_text" style="font-size:medium;font-family:monospace;white-space:pre-wrap;width:50em;color:rgb(0,0,0);font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial"> srun: Force Terminated job 2392836
srun: Job step aborted: Waiting up to 62 seconds for job step to finish.
slurmstepd: error: *** STEP 2392836.0 ON foo0205 CANCELLED AT 2018-04-19T18:33:34 DUE TO TIME LIMIT ***
srun: error: foo0205: tasks 0-1: Terminated
status: 143</pre>
However, when some tasks complete while others are timed out, srun always exits with a zero status. This is not what I expect, since a task was forcibly terminated. In the run below, the 3-minute limit lets task 0 (120-second sleep) finish but kills task 1 (240-second sleep):</div><div><span style="color:rgb(0,0,0);font-family:monospace;font-size:medium;white-space:pre-wrap"><br></span></div><div><span style="color:rgb(0,0,0);font-family:monospace;font-size:medium;white-space:pre-wrap"> > srun --time 3 --ntasks=2 perl -e 'sleep 120 + 120 * $ENV{SLURM_PROCID}'; echo "status: $?"</span><br></div><div><pre class="gmail-bz_comment_text" style="font-size:medium;font-family:monospace;white-space:pre-wrap;width:50em;color:rgb(0,0,0);font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial"> srun: Force Terminated job 2392845
srun: Job step aborted: Waiting up to 62 seconds for job step to finish.
slurmstepd: error: *** STEP 2392845.0 ON foo3009 CANCELLED AT 2018-04-19T18:37:04 DUE TO TIME LIMIT ***
srun: error: foo3009: task 1: Terminated
status: 0</pre>
Is my expectation off, or does this look like a genuine bug?</div><div><br></div><div>Thanks,</div><div><br></div><div> - Dan </div></div>