[slurm-users] Solved: Error upgrading slurmdbd from 19.05 to 20.02

Steininger, Herbert herbert_steininger at psych.mpg.de
Mon Mar 16 08:35:54 UTC 2020


Hi,

I just want to let you know that I solved the problem simply by renaming the columns back to 'pack_...' and starting slurmdbd again, which then renamed them to 'het_...' itself.
slurmdbd is running again.
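
For anyone hitting the same error, here is a minimal sketch of the rename-back statement. It only builds the SQL string; the helper name build_rename_back_query is made up for illustration, and the column definitions ("int unsigned not null") are copied from the converter code quoted further down in this thread. Backticks are used because the table name contains a hyphen. Treat it as a starting point under those assumptions, and back up the database before running anything like it.

```python
def build_rename_back_query(cluster_name, job_table="job_table"):
    """Build an ALTER TABLE statement that renames het_* back to pack_*.

    Hypothetical helper: it mirrors, in reverse, the statement that
    slurmdbd's pre-conversion step issues. It does not touch the database.
    """
    table = "{}_{}".format(cluster_name, job_table)
    return (
        "alter table `{}` "
        "change het_job_id pack_job_id int unsigned not null, "
        "change het_job_offset pack_job_offset int unsigned not null;"
    ).format(table)


if __name__ == "__main__":
    # Cluster name taken from the log output in this thread.
    print(build_rename_back_query("mpip-cluster"))
```

The printed statement can be pasted into the MariaDB client while slurmdbd is stopped; on the next start slurmdbd should then find the 'pack_...' columns it expects and rename them itself.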

Best,
Herbert


-----Original Message-----
From: slurm-users [mailto:slurm-users-bounces at lists.schedmd.com] On Behalf Of Steininger, Herbert
Sent: Friday, 13 March 2020 11:49
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] Error upgrading slurmdbd from 19.05 to 20.02

Hi,

I guess I found the problem.

It seems to come from this file:
src/plugins/accounting_storage/mysql/as_mysql_convert.c
in particular from here:

--- code ---
static int _convert_job_table_pre(mysql_conn_t *mysql_conn, char *cluster_name)
{
        int rc = SLURM_SUCCESS;
        char *query = NULL;

        if (db_curr_ver < 8) {
                /*
                 * Change the names pack_job_id and pack_job_offset to be het_*
                 */
                query = xstrdup_printf(
                        "alter table \"%s_%s\" "
                        "change pack_job_id het_job_id int unsigned not null, "
                        "change pack_job_offset het_job_offset "
                        "int unsigned not null;",
                        cluster_name, job_table);
        }

        if (query) {
                if (debug_flags & DEBUG_FLAG_DB_QUERY)
                        DB_DEBUG(mysql_conn->conn, "query\n%s", query);

                rc = mysql_db_query(mysql_conn, query);
                xfree(query);
                if (rc != SLURM_SUCCESS)
                        error("%s: Can't convert %s_%s info: %m",
                              __func__, cluster_name, job_table);
        }

        return rc;
}
--- code ---

It checks whether the database version is below 8 and, if so, renames the columns pack_job_id and pack_job_offset in the job table to het_*.

In convert_version_table, the version is 7:

--- mysql ---
MariaDB [slurm_acct_db]> select * from convert_version_table;
+------------+---------+
| mod_time   | version |
+------------+---------+
| 1579853103 |       7 |
+------------+---------+
1 row in set (0.00 sec)
--- mysql ---


But my job table already has the renamed columns (het_job_id and het_job_offset):

--- table ---
MariaDB [slurm_acct_db]> show columns from `mpip-cluster_job_table`;
+--------------------+---------------------+------+-----+------------+----------------+
| Field              | Type                | Null | Key | Default    | Extra          |
+--------------------+---------------------+------+-----+------------+----------------+
| job_db_inx         | bigint(20) unsigned | NO   | PRI | NULL       | auto_increment |
| mod_time           | bigint(20) unsigned | NO   |     | 0          |                |
| deleted            | tinyint(4)          | NO   |     | 0          |                |
| account            | tinytext            | YES  |     | NULL       |                |
| admin_comment      | text                | YES  |     | NULL       |                |
| array_task_str     | text                | YES  |     | NULL       |                |
| array_max_tasks    | int(10) unsigned    | NO   |     | 0          |                |
| array_task_pending | int(10) unsigned    | NO   |     | 0          |                |
| constraints        | text                | YES  |     | NULL       |                |
| cpus_req           | int(10) unsigned    | NO   |     | NULL       |                |
| derived_ec         | int(10) unsigned    | NO   |     | 0          |                |
| derived_es         | text                | YES  |     | NULL       |                |
| exit_code          | int(10) unsigned    | NO   |     | 0          |                |
| flags              | int(10) unsigned    | NO   |     | 0          |                |
| job_name           | tinytext            | NO   |     | NULL       |                |
| id_assoc           | int(10) unsigned    | NO   | MUL | NULL       |                |
| id_array_job       | int(10) unsigned    | NO   | MUL | 0          |                |
| id_array_task      | int(10) unsigned    | NO   |     | 4294967294 |                |
| id_block           | tinytext            | YES  |     | NULL       |                |
| id_job             | int(10) unsigned    | NO   | MUL | NULL       |                |
| id_qos             | int(10) unsigned    | NO   | MUL | 0          |                |
| id_resv            | int(10) unsigned    | NO   | MUL | NULL       |                |
| id_wckey           | int(10) unsigned    | NO   | MUL | NULL       |                |
| id_user            | int(10) unsigned    | NO   | MUL | NULL       |                |
| id_group           | int(10) unsigned    | NO   |     | NULL       |                |
| het_job_id         | int(10) unsigned    | NO   | MUL | NULL       |                |
| het_job_offset     | int(10) unsigned    | NO   |     | NULL       |                |
| kill_requid        | int(11)             | NO   |     | -1         |                |
| state_reason_prev  | int(10) unsigned    | NO   |     | NULL       |                |
| mcs_label          | tinytext            | YES  |     | NULL       |                |
| mem_req            | bigint(20) unsigned | NO   |     | 0          |                |
| nodelist           | text                | YES  |     | NULL       |                |
| nodes_alloc        | int(10) unsigned    | NO   | MUL | NULL       |                |
| node_inx           | text                | YES  |     | NULL       |                |
| partition          | tinytext            | NO   |     | NULL       |                |
| priority           | int(10) unsigned    | NO   |     | NULL       |                |
| state              | int(10) unsigned    | NO   |     | NULL       |                |
| timelimit          | int(10) unsigned    | NO   |     | 0          |                |
| time_submit        | bigint(20) unsigned | NO   |     | 0          |                |
| time_eligible      | bigint(20) unsigned | NO   | MUL | 0          |                |
| time_start         | bigint(20) unsigned | NO   |     | 0          |                |
| time_end           | bigint(20) unsigned | NO   | MUL | 0          |                |
| time_suspended     | bigint(20) unsigned | NO   |     | 0          |                |
| gres_req           | text                | NO   |     | NULL       |                |
| gres_alloc         | text                | NO   |     | NULL       |                |
| gres_used          | text                | NO   |     | NULL       |                |
| wckey              | tinytext            | NO   |     | NULL       |                |
| work_dir           | text                | NO   |     | NULL       |                |
| system_comment     | text                | YES  |     | NULL       |                |
| track_steps        | tinyint(4)          | NO   |     | NULL       |                |
| tres_alloc         | text                | NO   |     | NULL       |                |
| tres_req           | text                | NO   |     | NULL       |                |
+--------------------+---------------------+------+-----+------------+----------------+
52 rows in set (0.00 sec)

MariaDB [slurm_acct_db]>

--- table ---
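
The mismatch described above (version 7 in convert_version_table, but het_* columns already present) can be expressed as a small check. This is only a sketch: the function name diagnose and the returned strings are invented for illustration, and the threshold 8 is taken from the "db_curr_ver < 8" test in the quoted C code.

```python
def diagnose(db_version, columns):
    """Classify the job table state relative to the pre-conversion step.

    db_version: value from convert_version_table.
    columns: iterable of column names from the cluster's job table.
    Hypothetical helper for illustration; not part of Slurm.
    """
    cols = set(columns)
    old = {"pack_job_id", "pack_job_offset"}
    new = {"het_job_id", "het_job_offset"}

    if db_version >= 8:
        # The quoted code only renames when the version is below 8.
        return "no rename attempted"
    if old <= cols:
        return "rename should succeed"
    if new <= cols:
        # The case in this thread: the ALTER fails with error 1054.
        return "mismatch: version below 8 but columns already renamed"
    return "unexpected schema"


if __name__ == "__main__":
    print(diagnose(7, ["id_job", "het_job_id", "het_job_offset"]))
```

With the values shown in this message (version 7, het_* columns present) the check lands in the mismatch branch, which matches the "Unknown column 'pack_job_id'" error.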

Does anybody know what could be done to get slurmdbd up?
Is there an option to prevent this conversion of the MySQL tables?

Would I have to rebuild Slurm?


Thanks in advance,
Herbert


-----Original Message-----
From: slurm-users [mailto:slurm-users-bounces at lists.schedmd.com] On Behalf Of Steininger, Herbert
Sent: Thursday, 12 March 2020 16:01
To: slurm-users at lists.schedmd.com
Subject: [slurm-users] Error upgrading slurmdbd from 19.05 to 20.02

Hello,

While upgrading Slurm from 19.05 to 20.02, an error occurred when I tried to upgrade slurmdbd first.

The error is:
slurmdbd: debug:  Munge authentication plugin loaded
slurmdbd: debug2: mysql_connect() called for db slurm_acct_db
slurmdbd: debug2: Attempting to connect to slurmmaster:3306
slurmdbd: debug2: innodb_buffer_pool_size: 134217728
slurmdbd: debug2: innodb_log_file_size: 5242880
slurmdbd: debug2: innodb_lock_wait_timeout: 50
slurmdbd: error: Database settings not recommended values: innodb_buffer_pool_size innodb_log_file_size innodb_lock_wait_timeout
slurmdbd: pre-converting job table for mpip-cluster
slurmdbd: error: mysql_query failed: 1054 Unknown column 'pack_job_id' in 'mpip-cluster_job_table'
alter table "mpip-cluster_job_table" change pack_job_id het_job_id int unsigned not null, change pack_job_offset het_job_offset int unsigned not null;
slurmdbd: error: _convert_job_table_pre: Can't convert mpip-cluster_job_table info: Unknown error 1054
slurmdbd: error: issue converting tables before create
slurmdbd: Accounting storage MYSQL plugin failed
slurmdbd: error: Couldn't load specified plugin name for accounting_storage/mysql: Plugin init() callback failed
slurmdbd: error: cannot create accounting_storage context for accounting_storage/mysql
slurmdbd: fatal: Unable to initialize accounting_storage/mysql accounting storage plugin

How do I get the missing columns?

Thanks in advance,
Herbert





