[slurm-users] Staging data on the nodes one will be processing on via sbatch

Sat Apr 3 20:26:18 UTC 2021

Hi,

"Scratch space" is generally considered ephemeral storage that exists only
for the duration of the job (it's eligible for deletion in an epilog or a
next-job prolog).

If you've got other fast storage, in limited supply, that can hold staged
data, then by all means use it, but consider whether you want batch CPU
cores tied up for the wall time of the transfer. Staging could easily be
done on a time-shared frontend login node, from which users could then
submit jobs (via script) once the data is staged. Most of the transfer
wall-clock time is spent in network wait, so don't waste dedicated cores
on it.
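
As a rough illustration of the "stage on the login node, then submit" idea,
here is a minimal Python sketch. It assumes the staging area is fast storage
that both the login node and the compute nodes can see. The function names
and the STAGED_DATA variable are made up for the example; the only Slurm
feature used is sbatch's --export option.

```python
import shutil
import subprocess
from pathlib import Path


def stage_data(src: Path, staging_root: Path) -> Path:
    """Copy the input tree into the staging area and return the staged path."""
    staged = staging_root / src.name
    # dirs_exist_ok (Python 3.8+) lets a re-run reuse an existing staged copy.
    shutil.copytree(src, staged, dirs_exist_ok=True)
    return staged


def build_sbatch_command(job_script: Path, staged: Path) -> list:
    """Build the sbatch call that tells the job where the staged data lives."""
    # --export=ALL,NAME=value keeps the submit environment and adds NAME.
    # STAGED_DATA is a hypothetical variable name the job script would read.
    return ["sbatch", f"--export=ALL,STAGED_DATA={staged}", str(job_script)]


def stage_then_submit(src: Path, staging_root: Path, job_script: Path) -> str:
    """Stage first (no batch cores held), then hand the job to Slurm."""
    staged = stage_data(src, staging_root)
    result = subprocess.run(
        build_sbatch_command(job_script, staged),
        check=True,
        capture_output=True,
        text=True,
    )
    # Return whatever sbatch printed on submission.
    return result.stdout.strip()
```

A user would call stage_then_submit() from the login node with their own
paths; the copy runs there, and the batch allocation only starts once the
data is already in place.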

On Sat, Apr 3, 2021 at 4:13 PM Will Dennis <wdennis at nec-labs.com> wrote:

> What I mean by “scratch” space is indeed local persistent storage in our
> case; sorry if my use of “scratch space” is already a generally-known Slurm
> concept I don’t understand, or something like /tmp… That’s why my desired
> workflow is to “copy data locally / use data from copy / remove local copy”
> in separate steps.
>
> *From: *slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of
> Fulcomer, Samuel <samuel_fulcomer at brown.edu>
> *Date: *Saturday, April 3, 2021 at 4:00 PM
> *To: *Slurm User Community List <slurm-users at lists.schedmd.com>
> *Subject: *Re: [slurm-users] Staging data on the nodes one will be
> processing on via sbatch
>
> […]
>
> The best current workflow is to stage data into fast local persistent
> storage, and then to schedule jobs, or schedule a job that does it
> synchronously (TimeLimit=Stage+Compute). The latter is pretty unsocial and
> wastes cycles.
>
> […]
>