<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>It looks like the user's home directory doesn't exist on the
      node.</p>
    <p>If you are not using a shared home for the nodes, review your
      onboarding process to make sure it creates each user's home
      directory on every node before jobs can run there.</p>
    <p>If you are using a shared home, do the above and also have each
      node verify that the shared filesystems are mounted before it
      accepts jobs.</p>
    <p>-Brian Andrus<br>
    </p>
    <div class="moz-cite-prefix">On 3/6/2023 1:15 AM, Niels Carl W.
      Hansen wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:02106b37-6f09-e5fe-9c5c-43e4d7f1ab78@cscaa.dk">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      Hi all<br>
      <br>
      It seems there are still some issues with the autofs and
      job_container/tmpfs functionality in Slurm 23.02.<br>
      If the required directories aren't mounted on the allocated
      node(s) before job start, we get:<br>
      <br>
      <font size="-1" face="Courier New, Courier, monospace">slurmstepd:
        error: couldn't chdir to `/users/lutest': No such file or
        directory: going to /tmp instead<br>
        slurmstepd: error: couldn't chdir to `/users/lutest': No such
        file or directory: going to /tmp instead</font><br>
      <br>
      An easy workaround, however, is to include this line in the Slurm
      prolog on the slurmd nodes:<br>
      <font size="-1" face="Courier New, Courier, monospace"><br>
        /usr/bin/su - $SLURM_JOB_USER -c /usr/bin/true</font><br>
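      <br>
      <p>Restated as a small script with an added existence check (the
        fallback user and the verification step are my additions, not
        part of the original workaround; SLURM_JOB_USER is set by
        slurmd in the prolog environment):</p>

```shell
#!/bin/sh
# Trigger the autofs mount of the job user's home, then confirm it.
# The id/getent fallback is only so this sketch runs standalone.
user="${SLURM_JOB_USER:-$(id -un)}"
home_dir=$(getent passwd "$user" | cut -d: -f6)

# Accessing the path wakes the automounter, same effect as the su trick:
ls "$home_dir" >/dev/null 2>&1 || true

if [ -d "$home_dir" ]; then
    echo "home for $user is mounted at $home_dir"
else
    echo "home for $user failed to mount" >&2
    exit 1
fi
```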
      <br>
      But perhaps there is a better way to solve the problem?<br>
      <br>
      Best<br>
      Niels Carl<br>
      <br>
      <div class="moz-cite-prefix">On 3/2/23 12:27 AM, Jason Ellul
        wrote:<br>
      </div>
      <blockquote type="cite"
cite="mid:ME3P282MB271271238E9AD2C9923959B39CAD9@ME3P282MB2712.AUSP282.PROD.OUTLOOK.COM">
        <meta http-equiv="Content-Type" content="text/html;
          charset=UTF-8">
        <meta name="Generator" content="Microsoft Word 15 (filtered
          medium)">
        <style>@font-face
        {font-family:Helvetica;
        panose-1:0 0 0 0 0 0 0 0 0 0;}@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        font-size:10.0pt;
        font-family:"Calibri",sans-serif;}a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}</style>
        <div class="WordSection1">
          <p class="MsoNormal"><span
              style="font-size:11.0pt;mso-fareast-language:EN-US">Thanks
              so much Ole for the info and link,<o:p></o:p></span></p>
          <p class="MsoNormal"><span
              style="font-size:11.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
          <p class="MsoNormal"><span
              style="font-size:11.0pt;mso-fareast-language:EN-US">Your
              documentation is extremely useful.<o:p></o:p></span></p>
          <p class="MsoNormal"><span
              style="font-size:11.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
          <p class="MsoNormal"><span
              style="font-size:11.0pt;mso-fareast-language:EN-US">Prior
              to moving to 22.05 we had been using
              slurm-spank-private-tmpdir with an epilog to clean-up the
              folders on job completion, but we were hoping to move to
              the inbuilt functionality to ensure future compatibility
              and reduce complexity.<o:p></o:p></span></p>
          <p class="MsoNormal"><span
              style="font-size:11.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
          <p class="MsoNormal"><span
              style="font-size:11.0pt;mso-fareast-language:EN-US">Will
              try 23.02 and if that does not resolve our issue consider
              moving back to slurm-spank-private-tmpdir or auto_tmpdir.<o:p></o:p></span></p>
          <p class="MsoNormal"><span
              style="font-size:11.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
          <p class="MsoNormal"><span
              style="font-size:11.0pt;mso-fareast-language:EN-US">Thanks
              again,<o:p></o:p></span></p>
          <p class="MsoNormal"><span
              style="font-size:11.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
          <p class="MsoNormal"><span
              style="font-size:11.0pt;mso-fareast-language:EN-US">Jason<o:p></o:p></span></p>
          <p class="MsoNormal"><span
              style="font-size:11.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
          <p class="MsoNormal"><span
style="font-size:9.0pt;font-family:Helvetica;color:black;background:white">Jason
              Ellul</span><span
              style="font-size:9.0pt;font-family:Helvetica;color:black"><br>
              <span style="background:white">Head - Research Computing
                Facility</span><br>
              <span style="background:white">Office of Cancer Research</span><br>
              <span style="background:white">Peter MacCallum Cancer
                Centre<o:p></o:p></span></span></p>
          <p class="MsoNormal"><span
style="font-size:9.0pt;font-family:Helvetica;color:black;background:white"><o:p> </o:p></span></p>
          <p class="MsoNormal"><span
              style="font-size:11.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
          <div style="border:none;border-top:solid #B5C4DF
            1.0pt;padding:3.0pt 0cm 0cm 0cm">
            <p class="MsoNormal" style="margin-bottom:12.0pt"><b><span
                  style="font-size:12.0pt;color:black">From: </span></b><span
                style="font-size:12.0pt;color:black">slurm-users <a
                  class="moz-txt-link-rfc2396E"
                  href="mailto:slurm-users-bounces@lists.schedmd.com"
                  moz-do-not-send="true"><slurm-users-bounces@lists.schedmd.com></a>
                on behalf of Ole Holm Nielsen <a
                  class="moz-txt-link-rfc2396E"
                  href="mailto:Ole.H.Nielsen@fysik.dtu.dk"
                  moz-do-not-send="true"><Ole.H.Nielsen@fysik.dtu.dk></a><br>
                <b>Date: </b>Wednesday, 1 March 2023 at 8:29 pm<br>
                <b>To: </b><a class="moz-txt-link-abbreviated
                  moz-txt-link-freetext"
                  href="mailto:slurm-users@lists.schedmd.com"
                  moz-do-not-send="true">slurm-users@lists.schedmd.com</a>
                <a class="moz-txt-link-rfc2396E"
                  href="mailto:slurm-users@lists.schedmd.com"
                  moz-do-not-send="true"><slurm-users@lists.schedmd.com></a><br>
                <b>Subject: </b>Re: [slurm-users] Cleanup of
                job_container/tmpfs<o:p></o:p></span></p>
          </div>
          <div>
            <p class="MsoNormal" style="margin-bottom:12.0pt"><span
                style="font-size:11.0pt">
                Hi Jason,<br>
                <br>
                IMHO, the job_container/tmpfs is not working well in
                Slurm 22.05, but<br>
                there may be some significant improvements included in
                23.02 (announced<br>
                yesterday).  I've documented our experiences in the Wiki
                page<br>
              </span><a
href="https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_configuration/#temporary-job-directories"
                moz-do-not-send="true"><span style="font-size:11.0pt">https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_configuration/#temporary-job-directories</span></a><span
                style="font-size:11.0pt"><br>
                This page contains links to bug reports against the
                job_container/tmpfs<br>
                plugin.<br>
                <br>
                We're using the auto_tmpdir SPANK plugin with great
                success in Slurm 22.05.<br>
                <br>
                Best regards,<br>
                Ole<br>
                <br>
                <br>
                On 01-03-2023 03:27, Jason Ellul wrote:<br>
                > We have recently moved to slurm 22.05.8 and have
                configured<br>
                > job_container/tmpfs to allow private tmp folders.<br>
                ><br>
                > job_container.conf contains:<br>
                ><br>
                > AutoBasePath=true<br>
                ><br>
                > BasePath=/slurm<br>
                ><br>
                > And in slurm.conf we have set<br>
                ><br>
                > JobContainerType=job_container/tmpfs<br>
                ><br>
                > I can see the folders being created and they are
                being used but when a<br>
                > job completes the root folder is not being cleaned
                up.<br>
                ><br>
                > Example of running job:<br>
                ><br>
                > [root@papr-res-compute204 ~]# ls -al
                /slurm/14292874<br>
                ><br>
                > total 32<br>
                ><br>
                > drwx------   3 root      root    34 Mar  1 13:16 .<br>
                ><br>
                > drwxr-xr-x 518 root      root 16384 Mar  1 13:16 ..<br>
                ><br>
                > drwx------   2 mzethoven root     6 Mar  1 13:16
                .14292874<br>
                ><br>
                > -r--r--r--   1 root      root     0 Mar  1 13:16
                .ns<br>
                ><br>
                > Example once job completes /slurm/<jobid>
                remains:<br>
                ><br>
                > [root@papr-res-compute204 ~]# ls -al
                /slurm/14292794<br>
                ><br>
                > total 32<br>
                ><br>
                > drwx------   2 root root     6 Mar  1 09:33 .<br>
                ><br>
                > drwxr-xr-x 518 root root 16384 Mar  1 13:16 ..<br>
                ><br>
                > Is this to be expected or should the folder
                /slurm/<jobid> also be removed?<br>
                ><br>
                > Do I need to create an epilog script to remove the
                directory that is left?<o:p></o:p></span></p>
          </div>
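          <p>On the final question above, an epilog along those lines
            might look like this sketch (the helper function, the
            numeric-id guard, and the mount check are my assumptions;
            BasePath=/slurm matches the job_container.conf shown in the
            quoted message):</p>

```shell
#!/bin/sh
# Sketch of an epilog cleanup for the leftover job_container/tmpfs
# directory. cleanup_job_dir is a hypothetical helper, not Slurm API.

cleanup_job_dir() {
    # usage: cleanup_job_dir <basepath> <jobid>
    base=$1; jobid=$2
    case "$jobid" in
        ''|*[!0-9]*) return 1 ;;   # refuse anything but a numeric job id
    esac
    dir="$base/$jobid"
    # skip if something is still mounted beneath the directory
    if [ -d "$dir" ] && ! grep -qF " $dir" /proc/mounts; then
        rm -rf -- "$dir"
    fi
}

# In a real epilog (SLURM_JOB_ID is set by slurmd):
#   cleanup_job_dir /slurm "$SLURM_JOB_ID"
```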
        </div>
      </blockquote>
      <br>
    </blockquote>
  </body>
</html>