[slurm-users] slurmdbd upgrade startup error

Tina Fora tfora at riseup.net
Tue Aug 14 13:43:39 MDT 2018


Hi Ole,

I'm testing the upgrade on test clusters, two of them actually: one with
the exact same OS, using the same mysql server, and the other on an updated
OS with a local mysql installation. I also ran the mysql_upgrade command
you mentioned on the local installation.

My guess is that there is something in the database that slurmdbd does not
like. I'm not sure how to debug it further.
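
One check that might narrow it down: the 1062 error names the pair
'1042-1012321342', and the alter statement adds a unique index on
(id_job, time_submit), so the existing table may already hold more than one
row with the same pair. Below is a minimal sketch of that check. It uses an
in-memory sqlite3 table as a stand-in so it runs on its own; the three
sample rows are assumptions for illustration (only the 1042/1012321342 pair
comes from the error message), and only the relevant columns are modelled.

```python
import sqlite3

# Hypothetical stand-in for the real MySQL table; only the columns that
# matter for the new unique index (id_job, time_submit) are modelled.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "cluster_job_table" ('
    "job_db_inx INTEGER PRIMARY KEY, id_job INTEGER, time_submit INTEGER)"
)
rows = [
    (1, 1042, 1012321342),  # pair taken from the 1062 error message
    (2, 1042, 1012321342),  # made-up second row with the same pair
    (3, 1043, 1012321400),  # made-up row with a unique pair
]
conn.executemany('INSERT INTO "cluster_job_table" VALUES (?, ?, ?)', rows)

# List every (id_job, time_submit) pair that occurs more than once; such
# pairs would violate a unique index on those two columns.
DUPLICATE_QUERY = """
SELECT id_job, time_submit, COUNT(*) AS n
FROM "cluster_job_table"
GROUP BY id_job, time_submit
HAVING COUNT(*) > 1
"""
duplicates = conn.execute(DUPLICATE_QUERY).fetchall()
for id_job, time_submit, n in duplicates:
    print(f"id_job={id_job} time_submit={time_submit} appears {n} times")
```

The same GROUP BY query should be usable in the mysql client against the
real table. There the table name would need backticks, or ANSI_QUOTES in
sql_mode, since double quotes are otherwise read as a string literal; that
may also be why my manual run of the alter statement returned error 1064.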

Thanks,
T




> Hi Tina,
>
> Is it the same OS version for 17.02 and 17.11, or are you upgrading the
> OS (and possibly the MySQL/MariaDB) at the same time?  I assume you're
> testing the Slurm upgrade on a test server and not the production cluster?
>
> Did you check the steps mentioned in the thread "slurmdbd:
> mysql/accounting errors on 17.11.5 upgrade" which you initiated on May 7?
>
> /Ole
>
>
> On 14-08-2018 17:53, Tina Fora wrote:
>> I compiled slurm from standard rpmbuild. Upgrading from 17.02 to 17.11.9-2
>> is giving the error below. I'm not sure what the issue is with the
>> accounting storage plugin because it seems to load it ok. I tried to run
>> the failed mysql query manually and it returns an sql syntax error (full
>> error below). It works fine on an empty database but I need to upgrade our
>> existing one. How can I debug this further to see what the issue is?
>>
>>
>> # slurmdbd -V
>> slurm 17.11.9-2
>> #
>> # slurmdbd -D -vvvv
>> slurmdbd: debug:  Log file re-opened
>> slurmdbd: debug3: Trying to load plugin /usr/lib64/slurm/auth_munge.so
>> slurmdbd: debug:  Munge authentication plugin loaded
>> slurmdbd: debug3: Success.
>> slurmdbd: debug3: Trying to load plugin /usr/lib64/slurm/accounting_storage_mysql.so
>> slurmdbd: debug2: mysql_connect() called for db slurmdb
>> slurmdbd: pre-converting job table for cluster
>> slurmdbd: adding column pack_job_id after id_group in table "cluster_job_table"
>> slurmdbd: adding column pack_job_offset after pack_job_id in table "cluster_job_table"
>> slurmdbd: adding column mcs_label after kill_requid in table "cluster_job_table"
>> slurmdbd: adding column work_dir after wckey in table "cluster_job_table"
>> slurmdbd: adding key old_tuple (id_job, id_assoc, time_submit) to table "cluster_job_table"
>> slurmdbd: adding key pack_job (pack_job_id) to table "cluster_job_table"
>> slurmdbd: debug:  Table "cluster_job_table" has changed.  Updating...
>> slurmdbd: error: mysql_query failed: 1062 Duplicate entry '1042-1012321342' for key 'id_job'
>> alter table "cluster_job_table"
>> modify `job_db_inx` bigint unsigned not null auto_increment,
>> modify `mod_time` bigint unsigned default 0 not null,
>> modify `deleted` tinyint default 0 not null, modify `account` tinytext,
>> modify `admin_comment` text, modify `array_task_str` text,
>> modify `array_max_tasks` int unsigned default 0 not null,
>> modify `array_task_pending` int unsigned default 0 not null,
>> modify `cpus_req` int unsigned not null,
>> modify `derived_ec` int unsigned default 0 not null,
>> modify `derived_es` text, modify `exit_code` int unsigned default 0 not null,
>> modify `job_name` tinytext not null, modify `id_assoc` int unsigned not null,
>> modify `id_array_job` int unsigned default 0 not null,
>> modify `id_array_task` int unsigned default 0xfffffffe not null,
>> modify `id_block` tinytext, modify `id_job` int unsigned not null,
>> modify `id_qos` int unsigned default 0 not null,
>> modify `id_resv` int unsigned not null, modify `id_wckey` int unsigned not null,
>> modify `id_user` int unsigned not null, modify `id_group` int unsigned not null,
>> add `pack_job_id` int unsigned not null after id_group,
>> add `pack_job_offset` int unsigned not null after pack_job_id,
>> modify `kill_requid` int default -1 not null,
>> add `mcs_label` tinytext default '' after kill_requid,
>> modify `mem_req` bigint unsigned default 0 not null, modify `nodelist` text,
>> modify `nodes_alloc` int unsigned not null, modify `node_inx` text,
>> modify `partition` tinytext not null, modify `priority` int unsigned not null,
>> modify `state` int unsigned not null,
>> modify `timelimit` int unsigned default 0 not null,
>> modify `time_submit` bigint unsigned default 0 not null,
>> modify `time_eligible` bigint unsigned default 0 not null,
>> modify `time_start` bigint unsigned default 0 not null,
>> modify `time_end` bigint unsigned default 0 not null,
>> modify `time_suspended` bigint unsigned default 0 not null,
>> modify `gres_req` text not null default '',
>> modify `gres_alloc` text not null default '',
>> modify `gres_used` text not null default '',
>> modify `wckey` tinytext not null default '',
>> add `work_dir` text not null default '' after wckey,
>> modify `track_steps` tinyint not null,
>> modify `tres_alloc` text not null default '',
>> modify `tres_req` text not null default '',
>> drop primary key, add primary key (job_db_inx),
>> drop index id_job, add unique index (id_job, time_submit),
>> add key old_tuple (id_job, id_assoc, time_submit),
>> drop key rollup, add key rollup (time_eligible, time_end),
>> drop key rollup2, add key rollup2 (time_end, time_eligible),
>> drop key nodes_alloc, add key nodes_alloc (nodes_alloc),
>> drop key wckey, add key wckey (id_wckey),
>> drop key qos, add key qos (id_qos),
>> drop key association, add key association (id_assoc),
>> drop key array_job, add key array_job (id_array_job),
>> add key pack_job (pack_job_id),
>> drop key reserv, add key reserv (id_resv),
>> drop key sacct_def, add key sacct_def (id_user, time_start, time_end),
>> drop key sacct_def2, add key sacct_def2 (id_user, time_end, time_eligible);
>> slurmdbd: Accounting storage MYSQL plugin failed
>> slurmdbd: error: Couldn't load specified plugin name for accounting_storage/mysql: Plugin init() callback failed
>> slurmdbd: error: cannot create accounting_storage context for accounting_storage/mysql
>> slurmdbd: fatal: Unable to initialize accounting_storage/mysql accounting storage plugin
>>
>>
>> mysql error:
>> ERROR 1064 (42000): You have an error in your SQL syntax; check the manual
>> that corresponds to your MariaDB server version for the right syntax to use
>> near '"cluster_job_table" modify `job_db_inx` bigint unsigned not null
>> auto_increment,' at line 1
>
>




