<div dir="ltr">Fantastic, this is really helpful, thanks!</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Sep 28, 2023 at 12:05 PM Paul Edmon &lt;<a href="mailto:pedmon@cfa.harvard.edu" target="_blank">pedmon@cfa.harvard.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div>
    <p>Yes, it was later than that. If you are on 23.02 you are good.  We've
      been running with job_script storage turned on for years at this point
      and that part of the database only uses up 8.4G.  Our entire
      database takes up 29G on disk. So it's about 1/3 of the database. 
      We also have database compression, which helps with the on-disk
      size. Raw and uncompressed, our database is about 90G.  We keep 6
      months of data in our active database.<br>
    </p>
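    <p>For anyone sizing their own database, the ratios implied by the
      figures above can be worked out as below. This is only arithmetic
      on the numbers quoted in this message; the variable names are
      illustrative, and it assumes the 8.4G figure is an on-disk
      (compressed) size like the 29G total.</p>
    <pre>
```python
# Sizes quoted in the message above, in GB.
job_script_gb = 8.4      # job_script portion of the database (assumed on-disk)
on_disk_gb = 29.0        # whole database, compressed, on disk
uncompressed_gb = 90.0   # whole database, raw and uncompressed

# Share of the on-disk database taken up by stored job scripts.
script_fraction = job_script_gb / on_disk_gb

# Approximate space saving from database compression.
compression_ratio = uncompressed_gb / on_disk_gb

print(f"job_script share of on-disk size: {script_fraction:.0%}")
print(f"compression ratio: {compression_ratio:.1f}x")
```
    </pre>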
    <p>-Paul Edmon-<br>
    </p>
    <div>On 9/28/2023 1:57 PM, Ryan Novosielski
      wrote:<br>
    </div>
    <blockquote type="cite">
      
      Sorry for the duplicate e-mail in a short time: do you (or
      anyone) know when the hashing was added? I was planning to enable this on
      21.08, but we then had to delay our upgrade to it. I’m assuming it came
      later than that, as I believe that’s when the feature itself was added.
      <div><br>
        <blockquote type="cite">
          <div>On Sep 28, 2023, at 13:55, Ryan Novosielski
            <a href="mailto:novosirj@rutgers.edu" target="_blank">&lt;novosirj@rutgers.edu&gt;</a> wrote:</div>
          <br>
          <div>
            <div>
              Thank you; we’ll put in a feature request for improvements
              in that area, and thanks also for the warning. I thought
              of that in passing, but the real-world experience is
              really useful. I could easily see wanting that stuff to be
              retained for a shorter period than the main records, which is what
              I’d ask for.
              <div><br>
              </div>
              <div>I assume that archiving, in general, would also
                remove this stuff, since old jobs themselves will be
                removed?</div>
              <div><br>
                <div>
                  <div>
                    <div dir="auto" style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
                      <div dir="auto" style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
                        <div dir="auto" style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
                          <div dir="auto" style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">
                            <div style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
                              --<br>
                              #BlackLivesMatter</div>
                            <div style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
                              ____<br>
                              || \\UTGERS,    
                              |---------------------------*O*---------------------------<br>
                              ||_// the State<span style="white-space:pre-wrap"> </span> |
                                      Ryan Novosielski
                              - <a href="mailto:novosirj@rutgers.edu" target="_blank">novosirj@rutgers.edu</a><br>
                              || \\ University | Sr. Technologist
                              - 973/972.0922 (2x0922) ~*~ RBHS Campus<br>
                              ||  \\    of NJ<span style="white-space:pre-wrap"> </span> |
                              Office of Advanced Research Computing -
                              MSB A555B, Newark<br>
                                   `'</div>
                          </div>
                        </div>
                      </div>
                    </div>
                  </div>
                  <div><br>
                    <blockquote type="cite">
                      <div>On Sep 28, 2023, at 13:48, Paul Edmon
                        <a href="mailto:pedmon@cfa.harvard.edu" target="_blank">&lt;pedmon@cfa.harvard.edu&gt;</a> wrote:</div>
                      <br>
                      <div>
                        <div>Slurm should take care of it when you add
                          it.<br>
                          <br>
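                          On the configuration side: as far as I know,
                          slurm.conf documents AccountingStoreFlags as a
                          single comma-separated list, so the extra flags
                          would go on the existing line rather than on
                          separate lines. Below is a small Python sketch
                          that builds such a line. The helper name
                          merge_store_flags is hypothetical, and the
                          result should be checked against the slurm.conf
                          man page for your version.<br>
                          <pre>
```python
def merge_store_flags(existing, extra):
    """Return one AccountingStoreFlags line with de-duplicated flags.

    merge_store_flags is a hypothetical helper for illustration only;
    it is not part of Slurm.
    """
    flags = []
    for flag in list(existing) + list(extra):
        flag = flag.strip()
        if flag and flag not in flags:
            flags.append(flag)  # keep first-seen order
    return "AccountingStoreFlags=" + ",".join(flags)


# Current setting from the question, plus the two flags being considered.
line = merge_store_flags(["job_comment"], ["job_script", "job_env"])
print(line)
```
                          </pre>
                          <br>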
                          As for horror stories: under previous
                          versions our database grew so
                          large that it actually prevented us from
                          upgrading, and we had to drop the columns
                          containing the job_script and job_env.  This
                          was back before Slurm started hashing the
                          scripts so that it would only store one copy
                          of duplicate scripts.  Once hashing was in place we
                          found that the job_script data stayed at a
                          fairly reasonable size, as most users submit
                          functionally the same script each time.
                          However, the job_env data continued to grow like
                          crazy, as there are variables in our
                          environment that change fairly regularly
                          depending on where the user is. Thus the job_envs
                          ended up being too massive to keep around, and
                          so we had to drop them. Frankly, we never
                          really used them for debugging. The
                          job_scripts, though, are super useful and not
                          that much overhead.<br>
                          <br>
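                          To illustrate why hashing keeps job_script
                          storage small while job_env keeps growing, here
                          is a minimal Python sketch of content-addressed
                          storage. It is not Slurm's implementation: the
                          use of SHA-256, the in-memory dicts, and the
                          names store_blob, script, env_a, and env_b are
                          all assumptions made for illustration.<br>
                          <pre>
```python
import hashlib


def store_blob(table, blob):
    """Store blob under its content hash; identical blobs share one row."""
    key = hashlib.sha256(blob.encode("utf-8")).hexdigest()
    table.setdefault(key, blob)  # no-op if this content was seen before
    return key


scripts = {}
envs = {}

# The same script submitted from two places, with a differing environment.
script = "#!/bin/bash\nsrun ./my_app\n"
env_a = "HOSTNAME=login01\nPWD=/home/user/run1\n"
env_b = "HOSTNAME=login02\nPWD=/home/user/run2\n"

for env in (env_a, env_b):
    store_blob(scripts, script)
    store_blob(envs, env)

# One stored script, but one stored environment per distinct submission.
print(len(scripts), len(envs))
```
                          </pre>
                          The identical script collapses to a single
                          entry, while every environment that differs by
                          even one variable gets its own entry.<br>
                          <br>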
                          In summary my recommendation is to only store
                          job_scripts. job_envs add too much storage for
                          little gain, unless your job_envs are
                          basically the same for each user in each
                          location.<br>
                          <br>
                          Also, it should be noted that there is no way
                          to prune out job_scripts or job_envs right
                          now. So the only way to get rid of them if
                          they get large is to zero out the column in the
                          table. You can ask SchedMD for the MySQL
                          command to do this, as we had to do it here for
                          our job_envs.<br>
                          <br>
                          -Paul Edmon-<br>
                          <br>
                          On 9/28/2023 1:40 PM, Davide DelVento wrote:<br>
                          <blockquote type="cite">In my current Slurm
                            installation (recently upgraded to Slurm
                            v23.02.3), I only have<br>
                            <br>
                            AccountingStoreFlags=job_comment<br>
                            <br>
                            I now intend to add both<br>
                            <br>
                            AccountingStoreFlags=job_script<br>
                            AccountingStoreFlags=job_env<br>
                            <br>
                            leaving the default 4MB value
                            for max_script_size<br>
                            <br>
                            Do I need to do anything on the DB myself,
                            or will Slurm take care of the additional
                            tables if needed?<br>
                            <br>
                            Any
                            comments/suggestions/gotchas/pitfalls/horror_stories
                            to share? I know about the additional
                            disk space and potential load needed, and
                            with our resources and typical workload I
                            should be okay with that.<br>
                            <br>
                            Thanks!<br>
                          </blockquote>
                          <br>
                        </div>
                      </div>
                    </blockquote>
                  </div>
                  <br>
                </div>
              </div>
            </div>
          </div>
        </blockquote>
      </div>
      <br>
    </blockquote>
  </div>

</blockquote></div>