[slurm-users] temporary SLURM directories

Wed May 25 12:42:46 UTC 2022

In addition to the other suggestions, there's this:

https://slurm.schedmd.com/faq.html#tmpfs_jobcontainer
https://slurm.schedmd.com/job_container.conf.html

I would be interested to hear how well it works. It is so buried in the
documentation that, unfortunately, I didn't see it until after I had rolled a
solution similar to Diego's (which can be extended so that TaskProlog
sets the TMPDIR environment variable appropriately, and so that the disk
space used by the job is limited).
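
For what it's worth, here is a minimal sketch of the TMPDIR part. slurmd reads the standard output of the TaskProlog script and treats each line of the form "export NAME=value" as a variable to set in the task's environment, so the script has to print the assignment rather than run it. The /scratch/job.<jobid> layout, the SCRATCH_BASE setting and the task_tmpdir helper are assumptions for illustration; they are not taken from Diego's scripts.

```shell
#!/bin/bash
# Hypothetical TaskProlog.sh sketch (path layout and names are assumptions).
# A plain "export TMPDIR=..." here would only affect this script itself;
# the variable reaches the task only if the assignment is printed on stdout.

# Assumed site setting: base of the node-local scratch area.
scratch_base="${SCRATCH_BASE:-/scratch}"

task_tmpdir() {
    # Print the per-job scratch path for the given job ID.
    printf '%s/job.%s\n' "$scratch_base" "$1"
}

# Only emit the variable when running inside a job.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    echo "export TMPDIR=$(task_tmpdir "$SLURM_JOB_ID")"
fi
```

This assumes the Prolog script has already created that directory and handed it to the job's user before the task starts.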

All the best,

Mark

On Mon, 23 May 2022, Diego Zuccato wrote:

> Hi Arsene.
>
> I did something like that some weeks ago.
>
> I added these lines to slurm.conf:
> Prolog=/home/conf/Prolog.sh
> TaskProlog=/home/conf/TaskProlog.sh
> Epilog=/home/conf/Epilog.sh
>
> The scripts for prolog and epilog manage the creation (and permissions
> assignment) of a directory in local storage (including the job ID, so
> that different jobs don't get messed up).
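
A rough sketch of what such a Prolog/Epilog pair could look like, written as helper functions that both scripts might share. The function names, the /scratch base and the 700 mode are assumptions for illustration, not the contents of the scripts described above. Prolog and Epilog run on the compute node with SLURM_JOB_ID and SLURM_JOB_USER set in their environment.

```shell
#!/bin/bash
# Hypothetical helpers for Prolog.sh / Epilog.sh (all names are assumptions).

job_dir() {   # usage: job_dir BASE JOBID
    printf '%s/job.%s\n' "$1" "$2"
}

prolog_create() {   # usage: prolog_create BASE JOBID OWNER
    local dir
    dir=$(job_dir "$1" "$2")
    # Create the directory, hand it to the job's user, keep other users out.
    mkdir -p "$dir" && chown "$3" "$dir" && chmod 700 "$dir"
}

epilog_remove() {   # usage: epilog_remove BASE JOBID
    local dir
    dir=$(job_dir "$1" "$2")
    # Refuse to act on an empty job ID so a bug cannot remove "BASE/job.".
    [ -n "$2" ] && rm -rf -- "$dir"
}

# In Prolog.sh one might call:
#   prolog_create /scratch "$SLURM_JOB_ID" "$SLURM_JOB_USER"
# and in Epilog.sh:
#   epilog_remove /scratch "$SLURM_JOB_ID"
```

Including the job ID in the path is what keeps concurrent jobs on the same node apart.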
>
> The TaskProlog script should export an environment variable, but I couldn't
> make it work :(
> In your case, TaskProlog should copy the dataset to the local storage,
> and you should add a TaskEpilog script to copy back the results. I
> don't know whether TaskEpilog runs for aborted jobs.
>
> Moreover, IIRC you shouldn't do slow operations in the task prolog or
> epilog, so in your case a state machine implemented as a job array would
> probably be better suited than TaskProlog/TaskEpilog (you'd need
> Prolog/Epilog anyway): the first "job" copies to scratch, the second
> does the number crunching, and the third copies back the results.
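
One way the three-stage idea might be sketched is a single batch script that dispatches on SLURM_ARRAY_TASK_ID. Several things here are assumptions: the node name node01 is a placeholder, the stage names are invented, and the scratch directory is keyed on SLURM_ARRAY_JOB_ID so that all three tasks see the same path (which also means a per-job Epilog cleanup would have to leave it alone until the last stage). The "%1" suffix only limits the array to one running task at a time; it does not strictly promise ordering, so chaining separate jobs with sbatch --dependency=afterok may be the more robust choice.

```shell
#!/bin/bash
#SBATCH --array=0-2%1        # three stages, at most one running at a time
#SBATCH --nodelist=node01    # placeholder: pin all stages to one node

stage_for_task() {
    # Map an array task ID to a (hypothetical) stage name.
    case "$1" in
        0) echo stage_in ;;
        1) echo compute ;;
        2) echo stage_out ;;
        *) echo unknown ;;
    esac
}

# Keyed on the array's job ID so all three tasks share one directory.
workdir="/scratch/job.${SLURM_ARRAY_JOB_ID:-none}"

case "$(stage_for_task "${SLURM_ARRAY_TASK_ID:-}")" in
    stage_in)  mkdir -p "$workdir" && cp "$SLURM_SUBMIT_DIR"/* "$workdir"/ ;;
    compute)   cd "$workdir" && "$NAMD/namd2" +idlepoll +p11 run_eq.namd > run_eq.log ;;
    stage_out) cp "$workdir"/* "$SLURM_SUBMIT_DIR"/ && rm -rf -- "$workdir" ;;
esac
```

Outside an array job the task ID is unset, the stage resolves to "unknown", and the script does nothing.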
>
> HIH,
> Diego
>
> On 23/05/2022 11:30, Arsene Marian Alain wrote:
>>  Dear SLURM users,
>>
>>  I am the IT administrator of a small scientific computing center. We
>>  recently installed SLURM as the job scheduler on our cluster, and
>>  everything seems to be working fine. I just have a question about how to
>>  create temporary directories with SLURM.
>>
>>  We use some programs for scientific calculation (such as Gromacs,
>>  Gaussian, NAMD, etc.). So, the process is the following:
>>
>>  When we need to launch a calculation, the first step is to copy all the
>>  necessary files from the local "$SLURM_SUBMIT_DIR" directory to the
>>  "/scratch" of the remote node. The second step is to change to the
>>  "/scratch" of the remote node and run the program. Finally, when the
>>  program finishes, we copy all the output files from the remote node's
>>  "/scratch" back to the local "$SLURM_SUBMIT_DIR" directory.
>>
>>  So, is there any way to automatically generate a temporary directory
>>  inside the "/scratch" of the remote node?
>>
>>  At the moment I am creating that directory manually as follows:
>>
>>  export HOMEDIR="$SLURM_SUBMIT_DIR"
>>  export SCRATCHDIR="/scratch/job.$SLURM_JOB_ID.$USER"
>>  export WORKDIR="$SCRATCHDIR"
>>
>>  # Stage the input files into node-local scratch.
>>  mkdir -p "$WORKDIR"
>>  cp "$HOMEDIR"/* "$WORKDIR"
>>  cd "$WORKDIR" || exit 1
>>
>>  "$NAMD/namd2" +idlepoll +p11 run_eq.namd > run_eq.log
>>  wait
>>
>>  # Copy the results back to the submit directory.
>>  cp "$WORKDIR"/* "$HOMEDIR"
>>
>>  The main problem with creating the directory under "/scratch" manually
>>  is that when the calculation ends (successfully or not), users have to
>>  check "/scratch" and remove the directory by hand. I know I could
>>  include a line at the end of my script to delete that directory when the
>>  calculation is done, but I'm sure there must be a better way to do this.
>>
>>  Thanks in advance for the help.
>>
>>  best regards,
>>
>>  Alain
>> 
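
If the cleanup does stay in the job script, a trap on EXIT is a bit more dependable than a final "rm" line, because it also fires when the program fails partway through. The sketch below is only an illustration: make_workdir and run_in_scratch are made-up helper names, and the mktemp-based naming is an assumption. It does not cover a job killed with SIGKILL or a node crash, which is where an Epilog-side cleanup still helps.

```shell
#!/bin/bash
# Sketch: run a command in a private scratch directory that is removed
# when the command finishes, whether it succeeded or not.

make_workdir() {   # usage: make_workdir BASE
    # Create a private directory under BASE and print its path.
    mktemp -d "$1/job.${SLURM_JOB_ID:-manual}.XXXXXX"
}

run_in_scratch() {   # usage: run_in_scratch BASE COMMAND [ARGS...]
    (
        workdir=$(make_workdir "$1") || exit 1
        shift
        # The EXIT trap of this subshell removes the directory on any exit.
        trap 'rm -rf -- "$workdir"' EXIT
        cd "$workdir" || exit 1
        "$@"
    )
}

# Hypothetical use inside a job script:
#   run_in_scratch /scratch "$NAMD/namd2" +idlepoll +p11 run_eq.namd
```

Copying the inputs in and the results out would still have to happen inside the command that is run, before the trap removes the directory.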
>
> --
> Diego Zuccato
> DIFA - Dip. di Fisica e Astronomia
> Servizi Informatici
> Alma Mater Studiorum - Università di Bologna
> V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
> tel.: +39 051 20 95786