[slurm-users] Upgrading SLURM from 17.02.7 to 18.08.8 - Job ID gets reset

Florian Zillner fzillner at lenovo.com
Fri Oct 18 09:47:45 UTC 2019


Hi all,

We're using OpenHPC packages to run SLURM. The current OpenHPC version is 1.3.8 (SLURM 18.08.8), but we're still on 1.3.3 (SLURM 17.02.7) for now.

I've successfully tested the upgrade in a separate testing environment. It works fine as long as you follow the upgrade notes, so the upgrade itself is not the issue here.

However, I do see that the SLURM job ID gets reset to 1 instead of continuing the existing sequence, whereas job_db_inx keeps incrementing as before. This is visible, for example, when looking at the job queue. In the accounting database it looks like this:
MariaDB [slurm_acct_db]> select job_db_inx,id_job,pack_job_id,job_name from clustername_job_table limit 96070,96100;
+------------+--------+-------------+--------------+
| job_db_inx | id_job | pack_job_id | job_name     |
+------------+--------+-------------+--------------+
|     107116 |  96155 |           0 | bt           |
|     107118 |  96156 |           0 | bt           |
|     107119 |  96157 |           0 | bt           |
|     107120 |  96158 |           0 | cs_01        |
|     107121 |  96159 |           0 | cs_01        |
|     107123 |  96160 |           0 | cs_01        |
|     107124 |  96161 |           0 | cs_01        |
|     107125 |  96162 |           0 | cs_01        |
|     107126 |  96163 |           0 | cs_01        |
|     107127 |  96164 |           0 | cs_01        | <--- Last Job old version
|     107128 |      2 |           0 | hostname     | <--- Jobs after upgrade
|     107130 |      3 |           0 | hostname     |
|     107131 |      4 |           0 | hostname     |
|     107133 |      5 |           0 | hostname     |
|     107135 |      6 |           0 | hostname     |
|     107137 |      7 |           0 | hostname     |
|     107138 |      8 |           0 | hostname     |
|     107140 |      9 |           0 | hostname     |
|     107142 |     10 |           0 | hostname     |
|     107144 |     11 |           0 | test         |
|     107145 |     12 |           0 | test         |
|     107146 |     13 |           0 | test         |
|     107147 |     14 |           0 | test         |
|     107148 |     15 |           0 | test         |
|     107149 |     16 |           0 | testzilloooo |
|     107150 |     17 |           0 | testzilloooo |
|     107151 |     18 |           0 | testzilloooo |
|     107152 |     19 |           0 | testzilloooo |
|     107153 |     20 |           0 | testzilloooo |
|     107154 |     21 |           0 | testzilloooo |
+------------+--------+-------------+--------------+
30 rows in set (0.134 sec)
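
To locate such a reset point without scanning the output by eye, something like the following might help. It is only a sketch: it loads a few (job_db_inx, id_job) pairs copied from the table above into an in-memory SQLite table (a stand-in for the real MariaDB table; the name job_table and the function find_id_resets are made up for illustration) and reports every row where id_job drops while job_db_inx keeps increasing.

```python
import sqlite3

# A few (job_db_inx, id_job) pairs copied from the query output above.
ROWS = [
    (107125, 96162),
    (107126, 96163),
    (107127, 96164),
    (107128, 2),
    (107130, 3),
    (107131, 4),
]


def find_id_resets(rows):
    """Return (job_db_inx, previous id_job, id_job) wherever id_job drops."""
    # In-memory SQLite table; a hypothetical stand-in for clustername_job_table.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE job_table (job_db_inx INTEGER PRIMARY KEY, id_job INTEGER)"
    )
    conn.executemany("INSERT INTO job_table VALUES (?, ?)", rows)
    ordered = conn.execute(
        "SELECT job_db_inx, id_job FROM job_table ORDER BY job_db_inx"
    ).fetchall()
    conn.close()

    resets = []
    # Compare each row with its predecessor in job_db_inx order.
    for (_, prev_id), (inx, cur_id) in zip(ordered, ordered[1:]):
        if cur_id < prev_id:
            resets.append((inx, prev_id, cur_id))
    return resets


print(find_id_resets(ROWS))  # [(107128, 96164, 2)]
```

With the sample rows this reports a single reset at job_db_inx 107128, where id_job falls from 96164 to 2. Against the real database the same comparison would have to run over rows fetched from MariaDB instead.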

Question: is there a way to either a) let SLURM continue the job IDs as usual, or b) set the next job ID to an arbitrary number? If this is a known issue, I failed to find it.

Thx!
Florian
