[slurm-users] temporary SLURM directories
Diego Zuccato
diego.zuccato at unibo.it
Mon May 23 10:56:53 UTC 2022
Hi Arsene.
I did something like that some weeks ago.
I used these lines in slurm.conf:
Prolog=/home/conf/Prolog.sh
TaskProlog=/home/conf/TaskProlog.sh
Epilog=/home/conf/Epilog.sh
The Prolog and Epilog scripts handle creating (and setting permissions on)
a per-job directory on local storage; including the job ID in the name
keeps different jobs from stepping on each other.
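As a rough sketch (the script names, the /scratch layout and the defaults
are my assumptions, not anything Slurm dictates), the Prolog could look
like this:

```shell
#!/bin/bash
# Hypothetical Prolog.sh sketch: slurmd runs this as root on the compute node
# before the job starts, with SLURM_JOB_ID and SLURM_JOB_USER in its
# environment. The fallback defaults below exist only so the sketch can be
# exercised outside Slurm; on a real cluster you'd hardcode the scratch base.
SCRATCH_BASE="${SCRATCH_BASE:-/tmp/scratch}"
JOB_USER="${SLURM_JOB_USER:-$(id -un)}"
JOBDIR="$SCRATCH_BASE/job.${SLURM_JOB_ID:-0}.$JOB_USER"

mkdir -p "$JOBDIR"
chown "$JOB_USER" "$JOBDIR"   # the prolog runs as root, so handing over ownership works
chmod 700 "$JOBDIR"           # keep other users out of this job's scratch space
```

The matching Epilog.sh essentially just does `rm -rf "$JOBDIR"`, and since
the epilog runs even when a job fails or is cancelled, that is what
guarantees cleanup.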
The TaskProlog script is supposed to export an environment variable into
the task, but I couldn't get that part to work :(
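For what it's worth, the mechanism (per the slurm.conf man page) is that
slurmd parses the TaskProlog's standard output: a line of the form
"export NAME=value" is injected into the task's environment, and a
"print ..." line goes to the task's stdout. A minimal sketch, assuming the
same /scratch naming as above (the :-0 fallback is only so it runs outside
Slurm):

```shell
#!/bin/bash
# Hypothetical TaskProlog.sh sketch: runs as the job's user; slurmd reads its
# stdout, and "export NAME=value" lines become task environment variables.
line="export SCRATCHDIR=/scratch/job.${SLURM_JOB_ID:-0}.$(id -un)"
echo "$line"
echo "print scratch for this task: /scratch/job.${SLURM_JOB_ID:-0}.$(id -un)"
```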
In your case, TaskProlog should copy the dataset to the local storage
and then you should add a TaskEpilog script to copy back the result. I
don't know if the TaskEpilog gets run for aborted jobs.
Moreover, IIRC you shouldn't do slow operations in the task prolog or
epilog, so in your case a state machine implemented as a job array (or a
chain of jobs) would probably be a better fit than TaskProlog/TaskEpilog
(you'd still need Prolog/Epilog anyway): the first "job" copies the input
to scratch, the second does the number crunching, and the third copies the
results back.
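One way to wire that chain up (script names here are placeholders, and this
uses job dependencies rather than a literal array, but the idea is the
same):

```shell
#!/bin/bash
# Hypothetical submit script: stage-in, compute and stage-out as three chained
# jobs, so the slow copies never run inside TaskProlog/TaskEpilog.
# copy_in.sh, crunch.sh and copy_back.sh are made-up names.
if ! command -v sbatch >/dev/null 2>&1; then
    sbatch() { echo $RANDOM; }  # stub so the sketch can be exercised off-cluster
fi

jid1=$(sbatch --parsable copy_in.sh)                                # input -> /scratch
jid2=$(sbatch --parsable --dependency=afterok:$jid1 crunch.sh)      # number crunching
jid3=$(sbatch --parsable --dependency=afterany:$jid2 copy_back.sh)  # results -> home
echo "submitted chain: $jid1 -> $jid2 -> $jid3"
```

Using afterany (not afterok) on the last step means the results and logs
get copied back even when the compute step fails.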
HIH,
Diego
On 23/05/2022 11:30, Arsene Marian Alain wrote:
> Dear SLURM users,
>
> I am the IT administrator of a small scientific computing center. We
> recently installed Slurm as the job scheduler on our cluster and
> everything seems to be working fine. I just have a question about how to
> create temporary directories with Slurm.
>
> We use some programs for scientific calculation (such as Gromacs,
> Gaussian, NAMD, etc.). So, the process is the following:
>
> When we need to launch a calculation, the first step is to copy all the
> necessary files from the local "$SLURM_SUBMIT_DIR" directory to the
> "/scratch" of the remote node; the second step is to go into the remote
> node's "/scratch" and run the program. Finally, when the program
> finishes, we copy all the output files from the remote node's "/scratch"
> back to the local "$SLURM_SUBMIT_DIR" directory.
>
> So, is there any way to automatically generate a temporary directory
> inside the "/scratch" of the remote node?
>
> At the moment I am creating that directory manually as follows:
>
> export HOMEDIR=$SLURM_SUBMIT_DIR
> export SCRATCHDIR=/scratch/job.$SLURM_JOB_ID.$USER
> export WORKDIR=$SCRATCHDIR
> mkdir -p $WORKDIR
> cp $HOMEDIR/* $WORKDIR
> cd $WORKDIR
> $NAMD/namd2 +idlepoll +p11 run_eq.namd > run_eq.log
> wait
> cp $WORKDIR/* $HOMEDIR
>
> The main problem with creating the "/scratch" directory manually is that
> when the calculation ends (successfully or not), users have to go into
> "/scratch" and remove the directory by hand. I know I could add a line at
> the end of my script to delete that directory when the calculation is
> done, but I'm sure there must be a better way to do this.
>
> Thanks in advance for the help.
>
> best regards,
>
> Alain
>
--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786