[slurm-users] Slurm and shared file systems
Steven Dick
kg4ydw at gmail.com
Fri Jun 19 12:34:31 UTC 2020
Condor's original premise was to run long-running compute jobs on
distributed nodes with no shared filesystem.
Of course, they played all kinds of dirty tricks to make this work,
including intercepting libc and system calls.
I see no reason cleverly wrapped Slurm jobs couldn't do the same,
either prestaging files that would normally be on a shared fs,
wrapping calls, or maybe even using parts of Condor. But Slurm wasn't
designed to do this, so you'd have to add your own mechanisms for it.
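
As a rough illustration of the prestaging idea (not something Slurm or
Condor provides), a job wrapper could look something like the Python
sketch below. The name run_staged, its parameters, and the use of
shutil.copy2 as the transfer step are all assumptions made for
illustration; on a real cluster without a shared fs the copy would have
to be replaced by whatever mechanism actually moves files between the
submit host and the compute node.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_staged(command, inputs, outputs, dest_dir, scratch_root=None):
    """Stage inputs into private scratch, run the command there, copy outputs back.

    Hypothetical helper: shutil.copy2 is only a stand-in for the real
    transfer mechanism (sbcast, scp, rsync, ...), which this sketch
    does not implement.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # A private working directory under node-local scratch; it is removed
    # automatically when the block exits.
    with tempfile.TemporaryDirectory(dir=scratch_root) as work:
        work = Path(work)

        # Stage in: place every input file next to the payload.
        for src in inputs:
            shutil.copy2(src, work / Path(src).name)

        # Run the payload with the scratch directory as its cwd.
        result = subprocess.run(command, cwd=work)

        # Stage out: copy back only the outputs that were actually produced.
        for name in outputs:
            produced = work / name
            if produced.exists():
                shutil.copy2(produced, dest_dir / name)

    return result.returncode
```

The payload itself stays unaware of the staging: it reads and writes
relative paths in its working directory, and the wrapper decides what
goes in and what comes back out.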
On Fri, Jun 19, 2020 at 8:20 AM Riebs, Andy <andy.riebs at hpe.com> wrote:
>
> David,
>
>
>
> I’ve been using Slurm for nearly 20 years, and while I can imagine some clever work-arounds, like staging your job in /var/tmp on all of the nodes before trying to run it, it’s hard to imagine a cluster serving a useful purpose without a shared user file system, whether or not Slurm is involved.
>
>
>
> Having said that, I hope that someone comes up with a real use case to help me see something that I don’t currently see!
>
>
>
> Andy
>
>
>
> From: slurm-users [mailto:slurm-users-bounces at lists.schedmd.com] On Behalf Of David Baker
> Sent: Friday, June 19, 2020 8:05 AM
> To: Slurm User Community List <slurm-users at lists.schedmd.com>
> Subject: [slurm-users] Slurm and shared file systems
>
>
>
> Hello,
>
>
>
> We are currently helping a research group to set up their own Slurm cluster. They have asked a very interesting question about Slurm and file systems. That is, they are posing the question -- do you need a shared user file store on a Slurm cluster?
>
>
>
> So, in the extreme case where there is no shared file store for users, can Slurm operate properly over a cluster? I have seen commands like sbcast to move a file from the submission node to a compute node; however, that command can only transfer one file at a time. Furthermore, what would happen to the standard output files? I'm going to guess that there must be a shared file system; however, it would be good if someone could please confirm this.
>
>
>
> Best regards,
>
> David
>
>
>
>