<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Those are handled separately for each job, so the next job's
prolog does not wait for the previous job's epilog.</p>
<p>You may want to have your prolog check whether an epilog is
still running and wait for it to finish before starting its own
work.</p>
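<p>As a rough sketch of that idea (not something Slurm provides
itself), both scripts could take the same per-node file lock, so a
prolog blocks while an epilog for that node still holds it. The
helper name <code>node_lock</code>, the lock directory, and the
retry settings below are all made up for illustration, and it
assumes the scripts are Python and run on the same controller
host.</p>
<pre>
```python
import contextlib
import fcntl
import os
import time


@contextlib.contextmanager
def node_lock(node, lock_dir="/var/run/slurm_hooks", attempts=600, delay=1.0):
    """Hold an exclusive lock for `node` while the caller's work runs.

    lock_dir, attempts and delay are illustrative defaults, not Slurm settings.
    """
    os.makedirs(lock_dir, exist_ok=True)
    path = os.path.join(lock_dir, node + ".lock")
    with open(path, "w") as handle:
        for _ in range(attempts):
            try:
                # Non-blocking attempt; raises if another script holds the lock.
                fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
                break
            except BlockingIOError:
                time.sleep(delay)
        else:
            # Loop ended without a break: the lock was never acquired.
            raise TimeoutError("lock for %s still held after waiting" % node)
        try:
            yield path
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```
</pre>
<p>Each script would then wrap its work in
<code>with node_lock(node):</code>, taking <code>node</code> from
something like the <code>SLURM_JOB_NODELIST</code> environment
variable. That only maps cleanly to one lock per node for
single-node jobs; multi-node jobs would need the node list expanded
first.</p>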
<p>Brian Andrus<br>
</p>
On 9/27/2021 9:15 AM, Joe Teumer wrote:<br>
<blockquote type="cite"
cite="mid:CAJDZVfvMb6dkwL_1Wxbzu81Cj0ayJ7_6mRT_mObfXy=JAg4xng@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div>Should the PrologSlurmctld script for the next job run
only after the previous job's EpilogSlurmctld script finishes?</div>
<div><br>
</div>
<div>Below you can see Job A run and complete.</div>
<div>While EpilogSlurmctld (for Job A on Node A) is still
executing on the Slurm controller, the PrologSlurmctld
script for the next job (Job B on Node A) is also
running on the Slurm controller.</div>
<div><br>
</div>
<div>This breaks our workflow, as we expect the
next PrologSlurmctld script to run only once the prior
job is 100% complete (its
PrologSlurmctld completes AND the job completes AND its
EpilogSlurmctld completes).</div>
<div><br>
</div>
<div>PrologSlurmctld &gt; Here the prolog is starting for
Job B (ID 812)</div>
<div><b>2021-09-27 15:42:58,746</b> | INFO | Starting...</div>
<div><br>
</div>
<div>
<div>EpilogSlurmctld &gt; Here the epilog is starting and
ending for Job A (ID 811)</div>
<div><b>2021-09-27 15:42:56,694</b> | INFO |
Starting...<br>
</div>
</div>
<div><b>2021-09-27 15:43:01,756</b> | INFO | Exiting 0
after main<br>
</div>
<div><br>
</div>
<div>
<div>[2021-09-27T15:42:50.224] debug: sched/backfill:
_attempt_backfill: beginning</div>
<div>[2021-09-27T15:42:50.224] debug: sched/backfill:
_attempt_backfill: 1 jobs to backfill</div>
<div>[2021-09-27T15:42:56.653] _job_complete:
JobId=811 WEXITSTATUS 0</div>
<div>[2021-09-27T15:42:56.653] debug: email msg to
root: Slurm Job_id=811
Name=JobA_BIOS_fixedfreq_1067mclk_nps1.ini Ended,
Run time 00:22:36, COMPLETED, ExitCode 0</div>
<div><b>[2021-09-27T15:42:56.657]</b> _job_complete:
JobId=811 done</div>
<div>[2021-09-27T15:42:58.703] debug: sched: Running
job scheduler for full queue.</div>
<div><b>[2021-09-27T15:42:58.704]</b> debug: email
msg to root: Slurm Job_id=812
Name=JobA_BIOS_fixedfreq_1600mclk_nps1.ini Began,
Queued time 00:22:38</div>
<div>[2021-09-27T15:42:58.704] sched: Allocate
JobId=812 NodeList=xxx #CPUs=256 Partition=xxx</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</body>
</html>