[slurm-users] enabling job script archival
Paul Edmon
pedmon at cfa.harvard.edu
Thu Sep 28 18:03:27 UTC 2023
Yes, it was later than that. If you are on 23.02 you are good. We've been
running with job_script storage turned on for years at this point, and that
part of the database only uses up 8.4G. Our entire database takes up
29G on disk, so it's about 1/3 of the database. We also have database
compression, which helps with the on-disk size. Raw and uncompressed, our
database is about 90G. We keep 6 months of data in our active database.
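
For what it's worth, the compression is handled on the MariaDB/InnoDB side
rather than by Slurm itself, and the 6-month window is just the purge
options in slurmdbd.conf. A rough sketch with illustrative values only,
not our exact config:

    # slurmdbd.conf -- illustrative retention settings
    PurgeJobAfter=6months    # drop job records older than 6 months from the active DB
    PurgeStepAfter=6months   # drop step records on the same schedule
    PurgeEventAfter=6months  # drop node event records as well
    # ArchiveJobs/ArchiveSteps can be set to "yes" to dump records to
    # files before they are purged, if you want to keep the history.
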
-Paul Edmon-
On 9/28/2023 1:57 PM, Ryan Novosielski wrote:
> Sorry for the duplicate e-mail in a short time: do you (or does
> anyone) know when the hashing was added? We were planning to enable this
> on 21.08, but then had to delay our upgrade to it. I’m assuming it came
> later than that, as I believe that’s when the feature itself was added.
>
>> On Sep 28, 2023, at 13:55, Ryan Novosielski <novosirj at rutgers.edu> wrote:
>>
>> Thank you; we’ll put in a feature request for improvements in that
>> area, and thanks for the warning as well. I had thought of that in
>> passing, but the real-world experience is really useful. I could easily
>> see wanting that stuff to be retained for a shorter period than the main
>> records, which is what I’d ask for.
>>
>> I assume that archiving, in general, would also remove this stuff,
>> since the old jobs themselves will be removed?
>>
>> --
>> #BlackLivesMatter
>> ____
>> || \\UTGERS,     |---------------------------*O*---------------------------
>> ||_// the State  |         Ryan Novosielski - novosirj at rutgers.edu
>> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
>> ||  \\    of NJ  | Office of Advanced Research Computing - MSB A555B, Newark
>>      `'
>>
>>> On Sep 28, 2023, at 13:48, Paul Edmon <pedmon at cfa.harvard.edu> wrote:
>>>
>>> Slurm should take care of it when you add it.
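>>>
>>> One note on the syntax: I believe slurm.conf wants those as a single
>>> comma-separated list rather than separate lines, so (keeping your
>>> existing job_comment) the combined line would look something like:
>>>
>>>     AccountingStoreFlags=job_comment,job_script,job_env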
>>>
>>> As far as horror stories go: under previous versions our database size
>>> ballooned to be so massive that it actually prevented us from
>>> upgrading, and we had to drop the columns containing the job_script
>>> and job_env. This was back before Slurm started hashing the scripts
>>> so that it would only store one copy of duplicate scripts. After
>>> that point we found that the job_script part of the database stayed at
>>> a fairly reasonable size, as most users run functionally the same
>>> script each time. However, the job_env data continued to grow like
>>> crazy, as there are variables in our environment that change fairly
>>> often depending on where the user is. The job_envs thus ended up being
>>> too massive to keep around, and so we had to drop them. Frankly, we
>>> never really used them for debugging. The job_scripts, though, are
>>> super useful and not that much overhead.
>>>
>>> In summary, my recommendation is to store only job_scripts. job_envs
>>> add too much storage for little gain, unless your job_envs are
>>> basically the same for each user in each location.
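>>>
>>> (Side note, in case it's useful: once job_script storage is on, the
>>> saved script for a given job can be pulled back out with sacct,
>>> something like
>>>
>>>     sacct -j 12345 --batch-script
>>>
>>> where 12345 stands in for a real job ID. There should be a matching
>>> --env-vars option for job_env, if you do end up keeping those.)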
>>>
>>> Also, it should be noted that there is currently no way to prune out
>>> job_scripts or job_envs. So the only way to get rid of them if they
>>> get large is to zero out the column in the table. You can ask SchedMD
>>> for the MySQL command to do this, as we had to do that here for our
>>> job_envs.
>>>
>>> -Paul Edmon-
>>>
>>> On 9/28/2023 1:40 PM, Davide DelVento wrote:
>>>> In my current Slurm installation (recently upgraded to Slurm
>>>> v23.02.3), I only have
>>>>
>>>> AccountingStoreFlags=job_comment
>>>>
>>>> I now intend to add both
>>>>
>>>> AccountingStoreFlags=job_script
>>>> AccountingStoreFlags=job_env
>>>>
>>>> leaving the default 4MB value for max_script_size
>>>>
>>>> Do I need to do anything on the DB myself, or will Slurm take care
>>>> of the additional tables if needed?
>>>>
>>>> Any comments/suggestions/gotchas/pitfalls/horror_stories to share? I
>>>> know about the additional disk space and the potential extra load,
>>>> and with our resources and typical workload I should be okay with that.
>>>>
>>>> Thanks!
>>>
>>
>