[slurm-users] [External] Re: Upgrading SLURM from 17.02.7 to 18.08.8 - Job ID gets reset

Florian Zillner fzillner at lenovo.com
Fri Oct 18 12:53:35 UTC 2019


Hi Lech,

Thanks for the hint. I didn't know about that option.

Another way is to retain the StateSaveLocation files and copy them into the sandbox in which I tested the upgrade. Once I had copied the files and redone the upgrade from scratch, the job IDs were consecutive as expected. :)

Thanks,
Florian



-----Original Message-----
From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Lech Nieroda
Sent: Friday, October 18, 2019 12:18
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: [External] Re: [slurm-users] Upgrading SLURM from 17.02.7 to 18.08.8 - Job ID gets reset

Hi Florian,

You can use the FirstJobId option from slurm.conf to continue the JobIds seamlessly.
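
As a minimal sketch of what that could look like, going by the numbers in the quoted message below (the last job before the upgrade had id_job 96164): the value 96165 is an assumption derived from that table, not something verified on that cluster, so adjust it to your own last job ID plus one.

```
# slurm.conf (fragment)
# Assumed value: last pre-upgrade job ID (96164) + 1
FirstJobId=96165
```

As far as I know, FirstJobId is only consulted when slurmctld starts without saved job state, so it should not affect a controller whose StateSaveLocation is still intact.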

Kind Regards,
Lech

> On 18.10.2019 at 11:47, Florian Zillner <fzillner at lenovo.com> wrote:
> 
> Hi all,
>  
> We’re using OpenHPC packages to run SLURM. The current OpenHPC version is 1.3.8 (SLURM 18.08.8), though we’re still at 1.3.3 (SLURM 17.02.7) for now.
>  
> I’ve successfully tested an upgrade in a separate environment, and it works fine as long as you follow the upgrade notes. So the upgrade itself is not the issue here.
>  
> However, I do see that the SLURM job ID gets reset to 1 instead of continuing sequentially, whereas job_db_inx keeps incrementing as before. This is visible, for example, when looking at the job queue. From the database's perspective it looks like this:
> MariaDB [slurm_acct_db]> select job_db_inx,id_job,pack_job_id,job_name from clustername_job_table limit 96070,96100;
> +------------+--------+-------------+--------------+
> | job_db_inx | id_job | pack_job_id | job_name     |
> +------------+--------+-------------+--------------+
> |     107116 |  96155 |           0 | bt           |
> |     107118 |  96156 |           0 | bt           |
> |     107119 |  96157 |           0 | bt           |
> |     107120 |  96158 |           0 | cs_01        |
> |     107121 |  96159 |           0 | cs_01        |
> |     107123 |  96160 |           0 | cs_01        |
> |     107124 |  96161 |           0 | cs_01        |
> |     107125 |  96162 |           0 | cs_01        |
> |     107126 |  96163 |           0 | cs_01        |
> |     107127 |  96164 |           0 | cs_01        | <--- Last Job old version
> |     107128 |      2 |           0 | hostname     | <--- Jobs after upgrade
> |     107130 |      3 |           0 | hostname     |
> |     107131 |      4 |           0 | hostname     |
> |     107133 |      5 |           0 | hostname     |
> |     107135 |      6 |           0 | hostname     |
> |     107137 |      7 |           0 | hostname     |
> |     107138 |      8 |           0 | hostname     |
> |     107140 |      9 |           0 | hostname     |
> |     107142 |     10 |           0 | hostname     |
> |     107144 |     11 |           0 | test         |
> |     107145 |     12 |           0 | test         |
> |     107146 |     13 |           0 | test         |
> |     107147 |     14 |           0 | test         |
> |     107148 |     15 |           0 | test         |
> |     107149 |     16 |           0 | testzilloooo |
> |     107150 |     17 |           0 | testzilloooo |
> |     107151 |     18 |           0 | testzilloooo |
> |     107152 |     19 |           0 | testzilloooo |
> |     107153 |     20 |           0 | testzilloooo |
> |     107154 |     21 |           0 | testzilloooo |
> +------------+--------+-------------+--------------+
> 30 rows in set (0.134 sec)
>  
> Question: is there a way to either a) let SLURM continue the job IDs as usual, or b) set the next job ID to an arbitrary number? If this is documented somewhere, I failed to find it.
>  
> Thx!
> Florian



