[slurm-users] Cleanup of job_container/tmpfs

Michael Jennings mej at lanl.gov
Wed Mar 1 22:15:34 UTC 2023


On Wednesday, 01 March 2023, at 10:28:24 (+0100),
Ole Holm Nielsen wrote:


> but there may be some significant improvements included in 23.02 

TL;DR:  I can vouch for this.

The primary problem with the interaction between the new namespace
code and the automounter daemon was simply that the shared subtree
flags on the root mount within the Slurm-built mount namespace were
not selected with things like autofs in mind.  The root mount was
marked private (no mounts out, and no mounts in) rather than slave
(no mounts out, but mounts *do* come in); once that was found and
fixed, autofs and job containers played nicely again.
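
For anyone who wants to check which propagation type a mount actually
ended up with, here is a minimal sketch in Python.  It is not Slurm
code; it only reads the optional fields in /proc/self/mountinfo as
described in proc(5).  The function names are made up for
illustration, and it assumes the last entry mounted at "/" is the one
currently visible.

```python
def classify_propagation(mountinfo_line):
    """Return (mount_point, propagation) for one /proc/<pid>/mountinfo line."""
    fields = mountinfo_line.split()
    # Optional fields sit between the mount options (index 5) and a lone "-".
    separator = fields.index("-", 6)
    tags = {field.split(":", 1)[0] for field in fields[6:separator]}
    if "shared" in tags and "master" in tags:
        kind = "shared+slave"   # receives from its master, shares with peers
    elif "master" in tags:
        kind = "slave"          # mounts come in, none go out
    elif "shared" in tags:
        kind = "shared"         # mounts go both ways within the peer group
    elif "unbindable" in tags:
        kind = "unbindable"
    else:
        kind = "private"        # no mounts out, and no mounts in
    return fields[4], kind


def root_propagation(path="/proc/self/mountinfo"):
    """Propagation type of the mount at "/" as seen by this process."""
    kind = None
    with open(path) as handle:
        for line in handle:
            if not line.strip():
                continue
            mount_point, found = classify_propagation(line)
            if mount_point == "/":
                kind = found    # assumption: a later entry overmounts earlier ones
    return kind


if __name__ == "__main__":
    try:
        print("root mount propagation:", root_propagation())
    except OSError as err:
        print("could not read mountinfo:", err)
```

Running this inside a job step and again directly on the node should
show whether the root mount in the job's namespace differs from the
host's.  A result of "private" inside the job would match the
behavior described above, where new autofs mounts never show up.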

There were a couple of other oopsies with regard to mount/unmount
operations that were addressed at the same time.  If you're curious
for more detail, you can follow the links in this comment on our bug
report:

https://bugs.schedmd.com/show_bug.cgi?id=12567#c47

Happy Slurming!
Michael

-- 
Michael E. Jennings (he/him) <mej at lanl.gov>            https://hpc.lanl.gov/
HPC Platform Integration Engineer - Platforms Design Team - HPC Design Group
Ultra-Scale Research Center (USRC), 4200 W Jemez #301-25   +1 (505) 606-0605
Los Alamos National Laboratory,  P.O. Box 1663,  Los Alamos, NM   87545-0001


