[slurm-users] Get Job Array information in Epilog script

William Brown william at signalbox.org.uk
Fri Mar 17 12:11:10 UTC 2023


We create the temporary directories using SLURM_JOB_ID, and that works
fine with Job Arrays so far as I can see. Don't you have a problem
if a user has multiple jobs on the same node?
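
For illustration, a minimal sketch of that approach (the /work path and
the ownership handling are placeholders, not our exact scripts). Since
every array task gets its own unique SLURM_JOB_ID, and SLURM_JOB_ID is
available in both Prolog and Epilog, the same per-job directory name
works for array and non-array jobs:

    # Prolog (runs on each allocated node before the job starts):
    WORKDIR="/work/${SLURM_JOB_USER}/${SLURM_JOB_ID}"
    mkdir -p "${WORKDIR}"
    chown "${SLURM_JOB_USER}" "${WORKDIR}"

    # Epilog (runs on the node after the job has finished):
    WORKDIR="/work/${SLURM_JOB_USER}/${SLURM_JOB_ID}"
    if [ -n "${SLURM_JOB_ID}" ] && [ -e "${WORKDIR}" ]; then
            rm -rf "${WORKDIR}"
    fi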

William

On Fri, 17 Mar 2023 at 11:17, Timo Rothenpieler
<timo.rothenpieler at uni-bremen.de> wrote:
>
> Hello!
>
> I'm currently facing a bit of an issue with cleaning up after a job
> has completed.
>
> I've added the following bit of shell script to our cluster's Epilog script:
>
> > for d in "${SLURM_JOB_ID}" "${SLURM_JOB_ID}_${SLURM_ARRAY_TASK_ID}" "${SLURM_ARRAY_JOB_ID}_${SLURM_ARRAY_TASK_ID}"; do
> >         WORKDIR="/work/${SLURM_JOB_USER}/${d}"
> >         if [ -e "${WORKDIR}" ]; then
> >                 rm -rf "${WORKDIR}"
> >         fi
> > done
>
> However, it did not end up cleaning up the working directories of
> array jobs.
>
> After some investigation, I found the reason in the documentation:
>
>  > SLURM_ARRAY_JOB_ID/SLURM_ARRAY_TASK_ID: [...]
>  > Available in PrologSlurmctld, SrunProlog, TaskProlog,
>  > EpilogSlurmctld, SrunEpilog and TaskEpilog.
>
> So, now I wonder: how am I supposed to get that information in the
> Epilog script? The whole job is part of an array, so how do I get
> that information at the job level?
>
> The "obvious alternative" based on that documentation would be to put
> that bit of code into a TaskEpilog script. But my understanding of that
> is that the script would run after each one of potentially multiple
> srun-launched tasks in the same job, and would then clean up the
> work-dir while the job would still use it?
>
> I only want to do that bit of cleanup when the job is terminating.
>
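
To make the TaskEpilog concern concrete, here is a hypothetical batch
script with two srun steps (the script names are made up purely for
illustration). TaskEpilog runs after the tasks of each step, so a
cleanup placed there could remove the work directory between the two
steps, while the node-level Epilog only runs once, after the whole job
has finished:

    #!/bin/bash
    #SBATCH --array=1-10

    # Step 1 writes intermediate results into the per-job work directory.
    srun ./produce-intermediate-data.sh
    # TaskEpilog runs here, after the tasks of step 1 terminate; a cleanup
    # placed in it would already delete the work directory at this point.

    # Step 2 still expects the same work directory to exist.
    srun ./consume-intermediate-data.sh

    # The node-level Epilog runs only after the whole job has ended, which
    # is where cleanup like this actually belongs.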


