Purging of job_script and job_env from the database in 26.05
If you are using AccountingStoreFlags=job_script and/or job_env in slurm.conf [2] to store job scripts and job environments in the Slurm database, you may be interested in a useful feature introduced in Slurm 26.05. Previously, the stored job_script and job_env entries would accumulate without bound in the database, potentially causing database performance or size issues, especially while upgrading Slurm. With 26.05 there are new slurmdbd.conf options which can be used to purge job_script or job_env entries from the database, see ticket_23818 [1]. This new feature isn't mentioned at all in the Release notes [4], and is only mentioned inconspicuously, far down in the Changes section [3], as:
Adding new archive/purge options to allow for explicit archiving of job_scripts and job_env without jobs.
The safe way to enable the new purge parameters is to introduce them very gradually, for example:

PurgeJobScriptAfter=2000days
PurgeJobEnvAfter=2000days

and lower the values little by little over time (restarting slurmdbd each time). The MaxPurgeLimit [5] parameter may alternatively be used to limit the amount of database work.

Best regards,
Ole

[1] https://support.schedmd.com/show_bug.cgi?id=23818
[2] https://slurm.schedmd.com/slurm.conf.html#OPT_AccountingStoreFlags
[3] https://github.com/SchedMD/slurm/releases
[4] https://slurm.schedmd.com/release_notes.html
[5] https://slurm.schedmd.com/slurmdbd.conf.html#OPT_MaxPurgeLimit

--
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark
On Wednesday, May 27th, 2026 at 17:53, Ole Holm Nielsen via slurm-users <slurm-users@lists.schedmd.com> wrote:
The safe way to enable the new purge parameters is to introduce them very gradually, for example:
PurgeJobScriptAfter=2000days PurgeJobEnvAfter=2000days
and lower the values little by little over time (restarting slurmdbd each time).
Do you mean to start off with a value of 2000 days? Some sites (although perhaps not mine!) could have completely refreshed the hardware resources that Slurm is controlling within a 5-year period.

Kevin Buckley
On 5/29/26 09:40, Kevin Buckley (Pawsey) via slurm-users wrote:
On Wednesday, May 27th, 2026 at 17:53, Ole Holm Nielsen via slurm-users <slurm-users@lists.schedmd.com> wrote:
The safe way to enable the new purge parameters is to introduce them very gradually, for example:
PurgeJobScriptAfter=2000days PurgeJobEnvAfter=2000days
and lower the values little by little over time (restarting slurmdbd each time).
Do you mean to start off with a value of 2000 days?
Yes, a purge interval of 2000 days would do no harm and would ensure that you don't overload the slurmdbd server with database operations, so it's good for an initial test in which you check the slurmdbd performance over a couple of days of daily purging. I have an extended discussion of Slurm database purging in this Wiki page: https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_database/#setting-database-p...
Some sites (although perhaps not mine!) could have completely refreshed the hardware resources that Slurm is controlling within a 5-year period.
The compute node hardware is a separate resource from the Slurm database server. The age of the Slurm database contents is what matters when it comes to purging of records. Would you agree on these considerations?

Best regards,
Ole
On Friday, May 29th, 2026 at 17:00, Ole Holm Nielsen via slurm-users <slurm-users@lists.schedmd.com> wrote:
On 5/29/26 09:40, Kevin Buckley (Pawsey) via slurm-users wrote:
On Wednesday, May 27th, 2026 at 17:53, Ole Holm Nielsen via slurm-users <slurm-users@lists.schedmd.com> wrote:
The safe way to enable the new purge parameters is to introduce them very gradually, for example:
PurgeJobScriptAfter=2000days PurgeJobEnvAfter=2000days
and lower the values little by little over time (restarting slurmdbd each time).
Do you mean to start off with a value of 2000 days?
Yes, a purge interval of 2000 days would do no harm and would ensure that you don't overload the slurmdbd server with database operations, so it's good for an initial test in which you check the slurmdbd performance over a couple of days of daily purging.
I have an extended discussion of Slurm database purging in this Wiki page: https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_database/#setting-database-p... ...
Would you agree on these considerations?
So the assumption is that one is starting with a database that has entries more than 2000 days old? In that case, given that context, yes.

Regards,
Kevin Buckley
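The gradual lowering discussed in this thread can be sketched as a small helper that lists the successive values to put into slurmdbd.conf. This is only an illustration: the function names, the 1000-day target and the 250-day step are assumptions made for the example, not values recommended anywhere in the thread. Only the 2000-day starting point and the two parameter names come from the messages above.

```python
def purge_schedule(start_days, target_days, step_days):
    """Return descending retention values from start_days down to target_days.

    Hypothetical helper: the step size and target are site-specific choices.
    """
    if step_days <= 0:
        raise ValueError("step_days must be positive")
    if target_days > start_days:
        raise ValueError("target_days must not exceed start_days")
    # Step downwards, then always finish exactly on the target value.
    values = list(range(start_days, target_days, -step_days))
    values.append(target_days)
    return values


def conf_lines(days):
    """Format one schedule step as the two slurmdbd.conf purge settings."""
    return [f"PurgeJobScriptAfter={days}days", f"PurgeJobEnvAfter={days}days"]


if __name__ == "__main__":
    # Assumed example: start at 2000 days, end at 1000 days, lower by 250 days.
    for days in purge_schedule(2000, 1000, 250):
        print(f"# step: keep {days} days, then restart slurmdbd")
        for line in conf_lines(days):
            print(line)
```

Each step would be applied only after slurmdbd has run its purge with the previous value and its load has been checked. As noted in the last reply, the first step does any work only if the database actually holds entries older than the starting value.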