[slurm-users] Clean Up Scratch After Failed Job

Sean Mc Grath smcgrat at tcd.ie
Tue Oct 10 16:20:29 UTC 2023


Hi,

On one of our clusters, the "Epilog" setup in slurm.conf (Epilog=/etc/slurm/slurm.epilog.local) calls the following to run the tmpwatch utility with a very short access-time threshold on /tmp. As far as I know, tmpwatch can be run on other specified paths as well, not just /tmp.

#####################################
# First, clean out /tmp, ruthlessly #
#####################################
# 2022-04-20, smcgrat at tcd.ie, RT#25239, RT#25262
# this will delete anything in /tmp older than a minute

# note: $CLUSTER is assumed to be set elsewhere in our epilog environment
outfile=/tmp/epilog.$CLUSTER.$(date +%Y%m%d%H%M%S).txt
{
echo "------------------------------------------------------------------"
echo -n "* /tmp maintenance - "
date
echo "* Current usage"
df -h /tmp

echo "* Running tmpwatch to delete anything in /tmp older than a minute"
/usr/sbin/tmpwatch --atime 1m /tmp

echo "* /tmp usage now"
df -h /tmp

echo ""
} >> "${outfile}"
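
Since tmpwatch takes the directories to sweep as arguments, the same call could presumably be pointed at a scratch filesystem instead, e.g. something like the following (the path and age threshold here are made-up values for illustration, not something we actually run):

/usr/sbin/tmpwatch --mtime 7d /local/scratch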

The last thing our epilog setup does is run the standard /etc/slurm/slurm.epilog.clean.dist. Looking at that script, it uses the SLURM_UID and SLURM_JOB_ID variables, so I would guess the other variables documented for the Epilog context are available as well.
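
From memory, the top of that script simply bails out unless it is actually running in an epilog context, something like this (paraphrased, not a verbatim copy of the distributed script):

# exit quietly if the expected epilog variables are not set
if [ -z "$SLURM_UID" ] || [ -z "$SLURM_JOB_ID" ]; then
    exit 0
fi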

Best

Sean

---
Sean McGrath
Senior Systems Administrator, IT Services

________________________________
From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Jason Simms <jsimms1 at swarthmore.edu>
Sent: Tuesday 10 October 2023 16:59
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: [slurm-users] Clean Up Scratch After Failed Job

Hello all,

Our template scripts for Slurm include a workflow that copies files to a scratch space before running a job, copies any output files back to the original submit directory on completion, and finally cleans up (deletes) the scratch space before exiting. This works great until a job fails or is requeued, in which case the scratch space isn't cleaned up.
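
To make that concrete, the templates look roughly like this; the scratch layout and the run_analysis step are placeholders rather than our actual scripts:

#!/bin/bash
#SBATCH --job-name=example

# stage into per-job scratch (path layout is illustrative)
SCRATCH="/scratch/${USER}/${SLURM_JOB_NAME}-${SLURM_JOB_ID}"
mkdir -p "$SCRATCH"
cp -r "$SLURM_SUBMIT_DIR"/* "$SCRATCH"/
cd "$SCRATCH"

./run_analysis    # hypothetical workload

# stage out and clean up -- never reached if the job fails or is requeued
cp -r outputs "$SLURM_SUBMIT_DIR"/
cd / && rm -rf "$SCRATCH"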

In the past, I've run a cron job that deletes any material in scratch that hasn't been modified in more days than the maximum length of a job, but that can still allow "zombie" material to remain in scratch for quite a while. I'm intrigued by the idea of using an epilog script, triggered after each job completes (whether normally or due to failure, requeuing, etc.), to accomplish the same task more efficiently and consistently.
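
For what it's worth, that cron job amounted to a nightly find sweep along these lines (/etc/cron.d format; the path, depth, and 14-day window are illustrative values, with the window chosen to exceed the longest allowed job):

# remove per-job scratch trees that haven't been modified in 14+ days
0 3 * * * root find /scratch -mindepth 2 -maxdepth 2 -mtime +14 -exec rm -rf {} +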

The first question is in which context I would run the epilog. I presume I'd want it to run after a job completes entirely, so looking at the table in the documentation, I think I'd want an Epilog script that runs on the compute node. Reading the documentation, however, it is unclear to me whether all the variables I would need will be available in such a script. We use the variables $USER, $SLURM_JOB_NAME, and $SLURM_JOB_ID to create a path within scratch unique to each job.

Specifically, however, the documentation for $SLURM_JOB_NAME says:

"SLURM_JOB_NAME Name of the job. Available in PrologSlurmctld, SrunProlog, TaskProlog, EpilogSlurmctld, SrunEpilog and TaskEpilog."

So it doesn't seem to be available in the appropriate context. Thinking about it, however, I presume that if I use only $SLURM_JOB_ID and $USER (and correspondingly $SLURM_JOB_USER in the epilog script), the path would still be unique; in other words, I could simply drop the job name.
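
Something along these lines is what I have in mind, dropping the job name from the path; the scratch layout here is illustrative, not a settled design:

#!/bin/bash
# Epilog sketch: remove this job's scratch directory on the compute node.
# Assumes jobs create their scratch space as /scratch/<user>/<jobid>.
JOB_SCRATCH="/scratch/${SLURM_JOB_USER}/${SLURM_JOB_ID}"

# Refuse to act unless both variables are set and the path is a directory,
# so this can never remove anything broader than one job's tree.
if [ -n "$SLURM_JOB_USER" ] && [ -n "$SLURM_JOB_ID" ] && [ -d "$JOB_SCRATCH" ]; then
    rm -rf -- "$JOB_SCRATCH"
fi
exit 0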

Anyway, if anyone has any thoughts or examples of setting up something like this, I'd appreciate it!

Warmest regards,
Jason

--
Jason L. Simms, Ph.D., M.P.H.
Manager of Research Computing
Swarthmore College
Information Technology Services
(610) 328-8102
Schedule a meeting: https://calendly.com/jlsimms
