[slurm-users] Slurm Upgrade from 17.02

Steven Senator (slurm-dev-list) slurm-dev-list at senator.net
Thu Feb 20 17:50:21 UTC 2020


When upgrading to 18.08, it is prudent to add the following lines to your
/etc/my.cnf, as per
  https://slurm.schedmd.com/accounting.html
  https://slurm.schedmd.com/SLUG19/High_Throughput_Computing.pdf (slide #6)

[mysqld]
innodb_buffer_pool_size=1G
innodb_log_file_size=64M
innodb_lock_wait_timeout=900
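
As a minimal sketch (not part of the original procedure), a small shell
function can confirm that all three settings are present in the config
file before mysql is restarted. The function name check_mycnf is
hypothetical, and it only matches the underscore spelling of each key
as written above.

```shell
# check_mycnf [FILE]: report whether each recommended InnoDB setting
# appears as "key = value" in FILE (default /etc/my.cnf).
# Hypothetical helper; it does not parse sections or !include directives.
check_mycnf() {
  cnf="${1:-/etc/my.cnf}"
  for key in innodb_buffer_pool_size innodb_log_file_size innodb_lock_wait_timeout; do
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$cnf"; then
      echo "ok: $key"
    else
      echo "missing: $key"
    fi
  done
}
```

Once mysql is running again, the values actually in effect can be
listed with: mysql -e "SHOW VARIABLES LIKE 'innodb%';"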

If the node on which mysql is running has sufficient memory, you may
want to increase innodb_buffer_pool_size beyond 1G. That is just
the minimum threshold below which slurm complains. We use 8G, for
example, because it fits our churn rate for {job arrival, job dispatch
to run state} in RAM and our nodes have enough RAM to accommodate an 8G
cache. (References on tuning are below.)
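
To illustrate the arithmetic, here is a rough sketch that turns a
node's total memory into a candidate innodb_buffer_pool_size. The
function name suggest_pool_mb and the 50% default share are
assumptions for illustration only; the right share depends on what
else runs on the node. The 1G floor is the threshold mentioned above.

```shell
# suggest_pool_mb MEM_KB [PERCENT]
# Print a candidate innodb_buffer_pool_size in MB: PERCENT of MEM_KB,
# never below 1024M. PERCENT defaults to 50, an illustrative assumption.
suggest_pool_mb() {
  mem_kb=$1
  pct=${2:-50}
  mb=$(( mem_kb * pct / 100 / 1024 ))
  if [ "$mb" -lt 1024 ]; then
    mb=1024
  fi
  echo "${mb}M"
}

# Example on Linux, reading total memory (in kB) from /proc/meminfo:
#   suggest_pool_mb "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)" 50
```

With these assumptions, a node with 16 GiB of RAM (16777216 kB) and a
50% share yields 8192M.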

When you reset this, you will also need to remove the previous innodb
caches, which are probably in /var/lib/mysql. When we did this we
removed and recreated the slurm_acct_db, although that was partially
motivated by the fact that this coincided with an OS and database
patch upgrade and a major accounting and allocation cycle.
  0. Stop slurmctld, slurmdbd.
  1. Create a dump of your database. (mysqldump ...)
  2. Verify that the dump is complete and valid.
  3a. Remove the slurm_acct_db. (mysql -e "drop database slurm_acct_db;")
  3b. Stop your mysql instance cleanly.
  4. Check the logs. Verify that the mysql instance was stopped cleanly.
  5. rm /var/lib/mysql/ib_logfile? /var/lib/mysql/ibdata1
  6. Put the new lines as above into /etc/my.cnf with the log file
sized appropriately.
  7. Start mysql.
  8. Verify it started cleanly.
  9. Restart the slurm dbd manually, possibly in non-daemon mode.
(slurmdbd -D -vv)
  10. sacctmgr create cluster <your-cluster>

If you want to restore the data back into the database, do it
*before* step 9 so that the schema conversion can be performed. I like
using multiple "-vv" so that I can see some of the messages as that
conversion process proceeds.
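
The procedure above, including the optional restore, could be sketched
as the following script. It is a dry run by default and only prints
the commands. The systemd unit names, the dump path, the cluster name,
and the variable and function names are all assumptions, not part of
the original procedure. It also assumes slurm_acct_db is the only
InnoDB data in this mysql instance, since ibdata1 is shared by all
InnoDB tables.

```shell
#!/bin/sh
# Dry run by default: set DRY_RUN=0 to actually execute the commands.
DRY_RUN="${DRY_RUN:-1}"
DB_SERVICE="${DB_SERVICE:-mariadb}"        # assumed systemd unit name
DUMP="${DUMP:-/root/slurm_acct_db.sql}"    # hypothetical dump location
DATADIR="${DATADIR:-/var/lib/mysql}"
CLUSTER="${CLUSTER:-mycluster}"            # placeholder cluster name
RESTORE="${RESTORE:-0}"                    # 1 = reload the dump

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Print a step that has to be done by hand.
note() { echo "# $*"; }

# Runs in a subshell with set -e so a failed command stops the sequence.
rebuild_acct_db() (
  set -e
  run systemctl stop slurmctld slurmdbd
  run mysqldump --databases slurm_acct_db --result-file="$DUMP"
  note "verify that $DUMP is complete and valid"
  run mysql -e "drop database slurm_acct_db;"
  run systemctl stop "$DB_SERVICE"
  note "check the logs: was mysql stopped cleanly?"
  run rm "$DATADIR"/ib_logfile? "$DATADIR"/ibdata1
  note "add the [mysqld] settings to /etc/my.cnf"
  run systemctl start "$DB_SERVICE"
  note "verify that mysql started cleanly"
  if [ "$RESTORE" = "1" ]; then
    # Reload before slurmdbd starts so it can convert the schema.
    # A --databases dump recreates the database itself.
    run mysql -e "source $DUMP"
  fi
  note "next, by hand: slurmdbd -D -vv"
  if [ "$RESTORE" != "1" ]; then
    note "then: sacctmgr create cluster $CLUSTER"
  fi
)
```

Calling rebuild_acct_db prints the sequence for review. slurmdbd is
left as a manual step because "slurmdbd -D" stays in the foreground.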

Some references on mysql innodb_buffer_pool_size tuning:
  https://scalegrid.io/blog/calculating-innodb-buffer-pool-size-for-your-mysql-server/
  https://mariadb.com/kb/en/innodb-system-variables/#innodb_buffer_pool_size
  https://mariadb.com/kb/en/innodb-buffer-pool/
  https://www.percona.com/blog/2015/06/02/80-ram-tune-innodb_buffer_pool_size/
  https://dev.mysql.com/doc/refman/5.7/en/innodb-buffer-pool-resize.html

Hope this helps,
 -Steve Senator

On Wed, Feb 19, 2020 at 7:12 AM Ricardo Gregorio
<ricardo.gregorio at rothamsted.ac.uk> wrote:
>
> hi all,
>
>
>
> I am putting together an upgrade plan for slurm on our HPC. We are currently running old version 17.02.11. Would you guys advise us upgrading to 18.08 or 19.05?
>
>
>
> I understand we will have to also upgrade the version of mariadb from 5.5 to 10.X and pay attention to 'long db upgrade from 17.02 to 18.X or 19.X' and 'bug 6796' amongst other things.
>
>
>
> We would appreciate your comments/recommendations
>
>
>
> Regards,
>
> Ricardo Gregorio
>
> Research and Systems Administrator
>
> Operations ITS


