[slurm-users] [External] Re: Upgrading SLURM from 17.02.7 to 18.08.8 - Job ID gets reset
Florian Zillner
fzillner at lenovo.com
Fri Oct 18 12:53:35 UTC 2019
Hi Lech,
Thanks for the hint. I didn't know about that option.
Another way is to retain the StateSaveLocation files and copy them into the sandbox in which I tested the upgrade. Once I had copied the files and redone the upgrade from scratch, the job IDs continued consecutively as expected. :)
Thanks,
Florian
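
For anyone wanting to script the StateSaveLocation approach described above, here is a minimal sketch in Python. The helper name copy_state_dir and the example paths are hypothetical, not taken from this thread. It assumes slurmctld is stopped on both sides while the files are copied, and that file ownership is fixed up afterwards, since copytree keeps permissions but not ownership.

```python
import shutil
from pathlib import Path


def copy_state_dir(src, dst):
    """Copy a StateSaveLocation directory into a sandbox (hypothetical helper).

    Assumes slurmctld is not running on either side during the copy.
    """
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise FileNotFoundError(f"no such state directory: {src}")
    # dirs_exist_ok requires Python 3.8 or newer
    shutil.copytree(src, dst, dirs_exist_ok=True)
    # Return the copied top-level entries so the caller can sanity-check them
    return sorted(p.name for p in dst.iterdir())


# Example call; both paths are placeholders, use the StateSaveLocation
# value from your own slurm.conf:
# copy_state_dir("/path/to/old/StateSaveLocation", "/path/to/sandbox/StateSaveLocation")
```

The copy has to happen before the new slurmctld is started for the first time, otherwise it will already have started counting from its own fresh state.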
-----Original Message-----
From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Lech Nieroda
Sent: Friday, October 18, 2019 12:18
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: [External] Re: [slurm-users] Upgrading SLURM from 17.02.7 to 18.08.8 - Job ID gets reset
Hi Florian,
You can use the FirstJobId option from slurm.conf to continue the JobIds seamlessly.
Kind Regards,
Lech
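
As a rough illustration of the FirstJobId suggestion above, the following Python sketch derives the slurm.conf line from the highest job ID issued before the upgrade. The helper name first_job_id_line is hypothetical; the ID range is taken from the pre-upgrade rows of the table quoted below. It assumes FirstJobId is applied when slurmctld starts without a saved job state, which is worth checking against the slurm.conf man page for your version.

```python
def first_job_id_line(existing_job_ids):
    """Return a slurm.conf line that resumes numbering after the highest old job ID."""
    next_id = max(existing_job_ids) + 1
    return f"FirstJobId={next_id}"


# id_job values 96155..96164 are the pre-upgrade rows in the quoted table
old_ids = range(96155, 96165)
print(first_job_id_line(old_ids))  # prints FirstJobId=96165
```

The resulting line goes into slurm.conf before the upgraded slurmctld is started for the first time.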
> On 18.10.2019 at 11:47, Florian Zillner <fzillner at lenovo.com> wrote:
>
> Hi all,
>
> we’re using OpenHPC packages to run SLURM. Current OpenHPC Version is 1.3.8 (SLURM 18.08.8), though we’re still at 1.3.3 (SLURM 17.02.7), for now.
>
> I’ve successfully tested the upgrade in a separate environment; it works fine as long as you follow the upgrade notes, so the upgrade itself is not the issue here.
>
> However, I do see that the SLURM job ID gets reset to 1 instead of continuing sequentially, whereas job_db_inx keeps incrementing as before. This is visible, for example, when looking at the job queue. From a database perspective it looks like this:
> MariaDB [slurm_acct_db]> select job_db_inx,id_job,pack_job_id,job_name from clustername_job_table limit 96070,96100;
> +------------+--------+-------------+--------------+
> | job_db_inx | id_job | pack_job_id | job_name |
> +------------+--------+-------------+--------------+
> | 107116 | 96155 | 0 | bt |
> | 107118 | 96156 | 0 | bt |
> | 107119 | 96157 | 0 | bt |
> | 107120 | 96158 | 0 | cs_01 |
> | 107121 | 96159 | 0 | cs_01 |
> | 107123 | 96160 | 0 | cs_01 |
> | 107124 | 96161 | 0 | cs_01 |
> | 107125 | 96162 | 0 | cs_01 |
> | 107126 | 96163 | 0 | cs_01 |
> | 107127 | 96164 | 0 | cs_01 | <--- Last Job old version
> | 107128 | 2 | 0 | hostname | <--- Jobs after upgrade
> | 107130 | 3 | 0 | hostname |
> | 107131 | 4 | 0 | hostname |
> | 107133 | 5 | 0 | hostname |
> | 107135 | 6 | 0 | hostname |
> | 107137 | 7 | 0 | hostname |
> | 107138 | 8 | 0 | hostname |
> | 107140 | 9 | 0 | hostname |
> | 107142 | 10 | 0 | hostname |
> | 107144 | 11 | 0 | test |
> | 107145 | 12 | 0 | test |
> | 107146 | 13 | 0 | test |
> | 107147 | 14 | 0 | test |
> | 107148 | 15 | 0 | test |
> | 107149 | 16 | 0 | testzilloooo |
> | 107150 | 17 | 0 | testzilloooo |
> | 107151 | 18 | 0 | testzilloooo |
> | 107152 | 19 | 0 | testzilloooo |
> | 107153 | 20 | 0 | testzilloooo |
> | 107154 | 21 | 0 | testzilloooo |
> +------------+--------+-------------+--------------+
> 30 rows in set (0.134 sec)
>
> Question: is there a way to either a) let SLURM continue the job IDs as usual, or b) set the next job ID to an arbitrary number? If this is a known thing, I failed to find it.
>
> Thx!
> Florian