[slurm-users] enabling job script archival

Paul Edmon pedmon at cfa.harvard.edu
Thu Sep 28 18:03:27 UTC 2023


Yes, it was later than that. If you are on 23.02 you are good. We've
been running with job_script storage enabled for years at this point,
and that part of the database only uses 8.4G. Our entire database takes
up 29G on disk, so it's a little under 1/3 of the database. We also use
database compression, which helps with the on-disk size; raw and
uncompressed, our database is about 90G. We keep 6 months of data in
our active database.
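
To illustrate why hashing keeps job_script storage small but does little
for job_env (as described in the earlier message quoted below), here is a
toy Python sketch. It is not how slurmdbd is implemented: the class name,
the use of SHA-256, and the example script and environment contents are
all assumptions made purely for illustration.

```python
import hashlib


class HashedStore:
    """Toy content-addressed store: keeps one copy per distinct text.

    Hypothetical illustration only; not Slurm's actual schema or hash.
    """

    def __init__(self):
        self.blobs = {}        # digest -> stored text (one copy each)
        self.job_to_hash = {}  # job id -> digest of its text

    def add(self, job_id, text):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        # Only store the text if this digest has not been seen before.
        self.blobs.setdefault(digest, text)
        self.job_to_hash[job_id] = digest

    def stored_bytes(self):
        return sum(len(t.encode("utf-8")) for t in self.blobs.values())


# A user who resubmits the same script 1000 times (made-up content).
script = "#!/bin/bash\n#SBATCH -n 1\nsrun ./my_app\n"

scripts = HashedStore()
envs = HashedStore()
for job_id in range(1000):
    scripts.add(job_id, script)
    # The environment differs per job (made-up variables), so every
    # job produces a new digest and nothing is deduplicated.
    env = (
        "HOME=/home/user\n"
        f"SLURM_SUBMIT_HOST=login{job_id % 4}\n"
        f"TMPDIR=/tmp/job{job_id}\n"
    )
    envs.add(job_id, env)

print("distinct scripts stored:", len(scripts.blobs))
print("distinct envs stored:   ", len(envs.blobs))
print("script bytes:", scripts.stored_bytes(), "env bytes:", envs.stored_bytes())
```

In this sketch the script store holds a single copy no matter how many
jobs reference it, while the environment store grows with every job,
which matches the growth pattern described in the thread.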

-Paul Edmon-

On 9/28/2023 1:57 PM, Ryan Novosielski wrote:
> Sorry for the duplicate e-mail in a short time: do you (or anyone)
> know when the hashing was added? I was planning to enable this on
> 21.08, but we then had to delay our upgrade to it. I’m assuming the
> hashing came later than that, as I believe 21.08 is when the feature
> itself was added.
>
>> On Sep 28, 2023, at 13:55, Ryan Novosielski <novosirj at rutgers.edu> wrote:
>>
>> Thank you; we’ll put in a feature request for improvements in that 
>> area, and thanks also for the warning. I had thought of that in
>> passing, but the real-world experience is really useful. I could
>> easily see wanting that data to be retained for a shorter time than
>> the main records, which is what I’d ask for.
>>
>> I assume that archiving, in general, would also remove this stuff, 
>> since old jobs themselves will be removed?
>>
>> --
>> #BlackLivesMatter
>> ____
>> || \\UTGERS,     |---------------------------*O*---------------------------
>> ||_// the State  |         Ryan Novosielski - novosirj at rutgers.edu
>> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
>> ||  \\    of NJ  | Office of Advanced Research Computing - MSB A555B, Newark
>>      `'
>>
>>> On Sep 28, 2023, at 13:48, Paul Edmon <pedmon at cfa.harvard.edu> wrote:
>>>
>>> Slurm should take care of it when you add it.
>>>
>>> As far as horror stories go: under previous versions our database
>>> ballooned to such a size that it actually prevented us from
>>> upgrading, and we had to drop the columns containing the job_script
>>> and job_env. This was back before Slurm started hashing the scripts
>>> so that it would only store one copy of duplicate scripts. Once the
>>> hashing was in place, we found that the job_script data stayed at a
>>> fairly reasonable size, as most users use functionally the same
>>> script each time. However, the job_env data continued to grow like
>>> crazy, as there are variables in our environment that change
>>> frequently depending on where the user is. Thus the job_envs ended
>>> up being too massive to keep around, and so we had to drop them.
>>> Frankly, we never really used them for debugging. The job_scripts,
>>> though, are super useful and not that much overhead.
>>>
>>> In summary, my recommendation is to store only job_scripts. job_envs
>>> add too much storage for little gain, unless your job_envs are
>>> basically the same for each user in each location.
>>>
>>> Also, it should be noted that there is no way to prune
>>> job_scripts or job_envs right now. So the only way to get rid of
>>> them if they get large is to zero out the column in the table. You
>>> can ask SchedMD for the MySQL command to do this, as we had to do it
>>> here for our job_envs.
>>>
>>> -Paul Edmon-
>>>
>>> On 9/28/2023 1:40 PM, Davide DelVento wrote:
>>>> In my current Slurm installation (recently upgraded to Slurm
>>>> v23.02.3), I only have
>>>>
>>>> AccountingStoreFlags=job_comment
>>>>
>>>> I now intend to add both job_script and job_env, i.e. change it to
>>>>
>>>> AccountingStoreFlags=job_comment,job_script,job_env
>>>>
>>>> leaving the default 4MB value for max_script_size.
>>>>
>>>> Do I need to do anything on the DB myself, or will Slurm take care
>>>> of the additional tables if needed?
>>>>
>>>> Any comments/suggestions/gotchas/pitfalls/horror_stories to share?
>>>> I know about the additional disk space and potential load needed,
>>>> and with our resources and typical workload I should be okay with
>>>> that.
>>>>
>>>> Thanks!
>>>
>>
>