[slurm-users] Cleanup of job_container/tmpfs

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Tue Mar 7 08:19:39 UTC 2023


Hi Brian,

Presumably the users' home directories are NFS-automounted using autofs, and 
therefore the directory isn't mounted yet when the job starts.

The job_container/tmpfs plugin ought to work correctly with autofs, but 
maybe this is still broken in 23.02?
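
For reference, the prolog workaround quoted below could be wrapped up roughly 
like this. It is only a sketch: the function name mount_home_for and the skip 
message are made up for illustration, and it assumes the prolog runs as root 
on the compute node with SLURM_JOB_USER set in its environment.

```shell
#!/bin/bash
# Sketch of a Slurm prolog fragment (assumption: run as root on the compute
# node, with SLURM_JOB_USER set in the environment).

# mount_home_for is a hypothetical helper name, not part of Slurm.
mount_home_for() {
    local user="$1"
    if [ -z "$user" ]; then
        # Nothing to do without a user name; do not fail the prolog.
        echo "prolog: no job user given, skipping" >&2
        return 0
    fi
    # A login shell changes into the user's home directory, which makes
    # autofs mount it. This is the workaround line from the quoted mail.
    /usr/bin/su - "$user" -c /usr/bin/true
}

mount_home_for "${SLURM_JOB_USER:-}"
```

Note that the exit status of su becomes the exit status of the script here. 
Since a failing prolog normally gets the node drained, you may prefer to log 
the failure and return 0 instead.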

/Ole


On 3/6/23 21:06, Brian Andrus wrote:
> It looks like the user's home directory doesn't exist on the node.
> 
> If you are not using a shared home for the nodes, your onboarding process 
> should be looked at to ensure it can handle any issues that may arise.
> 
> If you are using a shared home, you should do the above and have the node 
> ensure the shared filesystems are mounted before allowing jobs.
> 
> -Brian Andrus
> 
> On 3/6/2023 1:15 AM, Niels Carl W. Hansen wrote:
>> Hi all
>>
>> There still seem to be some issues with the autofs - job_container/tmpfs 
>> functionality in Slurm 23.02.
>> If the required directories aren't mounted on the allocated node(s) 
>> before job start, we get:
>>
>> slurmstepd: error: couldn't chdir to `/users/lutest': No such file or 
>> directory: going to /tmp instead
>> slurmstepd: error: couldn't chdir to `/users/lutest': No such file or 
>> directory: going to /tmp instead
>>
>> An easy workaround, however, is to include this line in the Slurm prolog 
>> on the slurmd nodes:
>>
>> /usr/bin/su - $SLURM_JOB_USER -c /usr/bin/true
>>
>> but there might be a better way to solve the problem?


