[slurm-users] job_container/tmpfs and autofs

Ümit Seren uemit.seren at gmail.com
Thu Jan 12 08:23:35 UTC 2023


We had the same issue when we switched to the job_container plugin. We ended
up running cvmfs_config probe as part of the health check tool so that the
cvmfs repos stay mounted. However, after we switched on power saving we ran
into race conditions (a job could land on a node before the cvmfs repos were
mounted). In the end we switched to static mounts for the cvmfs repos on the
compute nodes.
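
For what it's worth, a check along these lines could run from a prolog or
health check script to catch that race before a job starts. This is only a
rough sketch, not what we actually run: the repo name software.example.org
and all function names are made up for illustration, and it assumes that
mounted repos show up in /proc/mounts with a mount point under /cvmfs/.

```python
from typing import Iterable, List, Set

CVMFS_ROOT = "/cvmfs/"  # assumption: default cvmfs mount root


def mounted_cvmfs_repos(mounts_text: str) -> Set[str]:
    """Parse /proc/mounts-style text and return the mounted cvmfs repo names."""
    repos = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        # Format: device mountpoint fstype options dump pass
        if len(fields) < 3:
            continue
        mountpoint, fstype = fields[1], fields[2]
        # Skip the autofs trigger itself; only count repos below /cvmfs/.
        if fstype != "autofs" and mountpoint.startswith(CVMFS_ROOT):
            repos.add(mountpoint[len(CVMFS_ROOT):])
    return repos


def missing_repos(required: Iterable[str], mounts_text: str) -> List[str]:
    """Return the required repos that are not mounted, sorted by name."""
    return sorted(set(required) - mounted_cvmfs_repos(mounts_text))


def check_node(required: Iterable[str], mounts_path: str = "/proc/mounts") -> int:
    """Return 0 if every required repo is mounted, 1 otherwise."""
    with open(mounts_path) as handle:
        missing = missing_repos(required, handle.read())
    for repo in missing:
        print(f"cvmfs repo not mounted: {repo}")
    return 1 if missing else 0
```

A wrapper script would call check_node(["software.example.org"]) and pass the
result to sys.exit(), so that a non-zero exit code marks the node as not
ready.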

Best
Ümit

On Thu, Jan 12, 2023, 09:17 Bjørn-Helge Mevik <b.h.mevik at usit.uio.no> wrote:

> In my opinion, the problem is with autofs, not with tmpfs.  Autofs
> simply doesn't work well when you are using detached mount namespaces and
> bind mounting.  We ran into this problem years ago (with an in-house
> SPANK plugin doing more or less what tmpfs does), and ended up simply
> not using autofs.
>
> I guess you could try using systemd's auto-mounting features, but I have
> no idea if they work better than autofs in situations like this.
>
> We ended up using a system where the prolog script mounts any needed
> file systems, and then the healthcheck script unmounts file systems that
> are no longer needed.
>
> --
> Regards,
> Bjørn-Helge Mevik, dr. scient,
> Department for Research Computing, University of Oslo
>