[slurm-users] Get Job Array information in Epilog script
Timo Rothenpieler
timo.rothenpieler at uni-bremen.de
Fri Mar 17 11:15:33 UTC 2023
Hello!
I'm currently facing an issue with cleanup after a job has completed.
I've added the following bit of shell script to our cluster's Epilog script:
> for d in "${SLURM_JOB_ID}" "${SLURM_JOB_ID}_${SLURM_ARRAY_TASK_ID}" "${SLURM_ARRAY_JOB_ID}_${SLURM_ARRAY_TASK_ID}"; do
>     WORKDIR="/work/${SLURM_JOB_USER}/${d}"
>     if [ -e "${WORKDIR}" ]; then
>         rm -rf "${WORKDIR}"
>     fi
> done
However, it did not clean up the working directories of array jobs.
After some investigation, I found the reason in the documentation:
> SLURM_ARRAY_JOB_ID/SLURM_ARRAY_TASK_ID: [...]
> Available in PrologSlurmctld, SrunProlog, TaskProlog,
> EpilogSlurmctld, SrunEpilog and TaskEpilog.
So, now I wonder... how am I supposed to get that information in the
Epilog script? The whole job is part of an array, so how do I get the
information at a job level?
The "obvious alternative" based on that documentation would be to put
that bit of code into a TaskEpilog script. But my understanding is that
the script would run after each of potentially multiple srun-launched
tasks in the same job, and would then clean up the work dir while the
job is still using it?
I only want to do that bit of cleanup when the job is terminating.
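
A possible workaround, sketched under assumptions rather than confirmed
for this setup: the Epilog could ask the controller for the job record
and read the array IDs from it. The sketch assumes that
`scontrol show job <id>` prints `ArrayJobId=` and `ArrayTaskId=` fields
for array tasks, that scontrol is on the Epilog's PATH, and that the job
record is still available while the Epilog runs. The function name
array_dir_name is hypothetical. Note that the Slurm Prolog and Epilog
guide cautions against calling Slurm commands from these scripts, so
this trades simplicity against extra load on slurmctld.

```shell
# Hypothetical helper: reads `scontrol show job` output on stdin and
# prints "<ArrayJobId>_<ArrayTaskId>", or nothing for non-array jobs.
array_dir_name() (
    set -f    # no globbing; only word splitting of the input is wanted
    aj=""
    at=""
    for tok in $(cat); do
        case "${tok}" in
            ArrayJobId=*)  aj="${tok#ArrayJobId=}" ;;
            ArrayTaskId=*) at="${tok#ArrayTaskId=}" ;;
        esac
    done
    # Accept plain numbers only (rejects empty values and ranges like "0-9").
    case "${aj}" in ''|*[!0-9]*) exit 0 ;; esac
    case "${at}" in ''|*[!0-9]*) exit 0 ;; esac
    printf '%s_%s\n' "${aj}" "${at}"
)

# Assumed usage inside the Epilog; skipped when not running under Slurm.
if [ -n "${SLURM_JOB_ID:-}" ] && [ -n "${SLURM_JOB_USER:-}" ] \
        && command -v scontrol >/dev/null 2>&1; then
    d="$(scontrol show job "${SLURM_JOB_ID}" | array_dir_name)"
    if [ -n "${d}" ]; then
        WORKDIR="/work/${SLURM_JOB_USER}/${d}"
        if [ -e "${WORKDIR}" ]; then
            rm -rf "${WORKDIR}"
        fi
    fi
fi
```

The parsing is kept separate from the scontrol call so it can be checked
offline, for example with
`printf 'JobId=123 ArrayJobId=120 ArrayTaskId=3\n' | array_dir_name`.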