[slurm-users] [ext] Re: Cleanup of job_container/tmpfs

Hagdorn, Magnus Karl Moritz magnus.hagdorn at charite.de
Tue Mar 7 14:13:13 UTC 2023


I just upgraded Slurm to 23.02 on our test cluster to try out the new
job_container/tmpfs stuff. I can confirm it works with autofs (hurrah!),
but you need to set the Shared=true option in the job_container.conf
file.
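
For anyone wanting to try the same thing, a minimal sketch of the relevant
settings might look like the fragment below. Only Shared=true is taken from
my test; the BasePath value is a hypothetical placeholder, and the
slurm.conf lines are what I understand the plugin documentation to require,
so please check them against your own setup.

```
# slurm.conf (excerpt)
JobContainerType=job_container/tmpfs
PrologFlags=Contain

# job_container.conf
AutoBasePath=true
# Hypothetical location; use a node-local path that suits your site.
BasePath=/var/spool/slurm/job_containers
# Needed so that autofs mounts propagate into the job's namespace.
Shared=true
```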
Cheers
magnus

On Tue, 2023-03-07 at 09:19 +0100, Ole Holm Nielsen wrote:
> Hi Brian,
> 
> Presumably the users' home directory is NFS automounted using autofs,
> and therefore it doesn't exist when the job starts.
> 
> The job_container/tmpfs plugin ought to work correctly with autofs,
> but maybe this is still broken in 23.02?
> 
> /Ole
> 
> 
> On 3/6/23 21:06, Brian Andrus wrote:
> > That looks like the users' home directory doesn't exist on the node.
> > 
> > If you are not using a shared home for the nodes, your onboarding
> > process should be looked at to ensure it can handle any issues that
> > may arise.
> > 
> > If you are using a shared home, you should do the above and have the
> > node ensure the shared filesystems are mounted before allowing jobs.
> > 
> > -Brian Andrus
> > 
> > On 3/6/2023 1:15 AM, Niels Carl W. Hansen wrote:
> > > Hi all
> > > 
> > > It seems there are still some issues with the autofs /
> > > job_container/tmpfs functionality in Slurm 23.02.
> > > If the required directories aren't mounted on the allocated node(s)
> > > before the job starts, we get:
> > > 
> > > slurmstepd: error: couldn't chdir to `/users/lutest': No such file or directory: going to /tmp instead
> > > slurmstepd: error: couldn't chdir to `/users/lutest': No such file or directory: going to /tmp instead
> > > 
> > > An easy workaround, however, is to include this line in the Slurm
> > > prolog on the slurmd nodes:
> > > 
> > > /usr/bin/su - $SLURM_JOB_USER -c /usr/bin/true
> > > 
> > > -but there might exist a better way to solve the problem?
> 
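
For sites that cannot set Shared=true yet, the prolog workaround quoted
above could be wrapped roughly as follows. This is only a sketch, assuming
the prolog runs as root on the compute node; the function name
trigger_home_mount is made up for illustration, and the su invocation is the
one from Niels Carl's mail.

```shell
#!/bin/bash
# Prolog sketch: make autofs mount the job owner's home directory
# before the job starts.

trigger_home_mount() {
    # A login shell that just runs `true` touches $HOME, which is
    # enough to trigger the automount.
    /usr/bin/su - "$1" -c /usr/bin/true
}

# SLURM_JOB_USER is set by slurmd in the prolog environment.
if [ -n "${SLURM_JOB_USER:-}" ]; then
    # Only log a failure; a non-zero prolog exit status would drain the node.
    trigger_home_mount "$SLURM_JOB_USER" ||
        echo "prolog: could not trigger home mount for $SLURM_JOB_USER" >&2
fi
```

Logging instead of failing is a design choice here: the job then falls back
to /tmp as in the error above, rather than taking the node out of service.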

-- 
Magnus Hagdorn
Charité – Universitätsmedizin Berlin
Geschäftsbereich IT | Scientific Computing
 
Campus Charité Virchow Klinikum
Forum 4 | Ebene 02 | Raum 2.020
Augustenburger Platz 1
13353 Berlin
 
magnus.hagdorn at charite.de
https://www.charite.de
HPC Helpdesk: sc-hpc-helpdesk at charite.de