[slurm-users] cluster not registered
Buckley, Ronan
Ronan.Buckley at Dell.com
Tue Jun 5 08:16:20 MDT 2018
Hi All,
Commands like sacct and sreport return empty output:
# sreport cluster utilization
--------------------------------------------------------------------------------
Cluster Utilization 2018-06-04T00:00:00 - 2018-06-04T23:59:59
Use reported in TRES Minutes
--------------------------------------------------------------------------------
Cluster Allocate Down PLND Dow Idle Reserved Reported
--------- -------- -------- -------- -------- -------- --------
# sacct
JobID JobName Partition Account AllocCPUS State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
#
I can also see the following errors being logged in /var/log/slurmdbd:
[2018-06-05T16:04:03.865] DBD_JOB_START: cluster not registered
[2018-06-05T16:04:03.865] DBD_JOB_START: cluster not registered
[2018-06-05T16:04:03.866] DBD_JOB_START: cluster not registered
[2018-06-05T16:04:03.866] DBD_JOB_START: cluster not registered
[2018-06-05T16:04:03.866] DBD_JOB_START: cluster not registered
[2018-06-05T16:04:03.866] DBD_JOB_START: cluster not registered
[2018-06-05T16:04:03.866] DBD_JOB_START: cluster not registered
[2018-06-05T16:04:03.867] error: We should have gotten a new id: Table 'slurm_acct_db.slurm_cluster_job_table' doesn't exist
[2018-06-05T16:04:03.867] error: It looks like the storage has gone away trying to reconnect
[2018-06-05T16:04:03.867] error: We should have gotten a new id: Table 'slurm_acct_db.slurm_cluster_job_table' doesn't exist
[2018-06-05T16:04:03.867] DBD_JOB_START: cluster not registered
[2018-06-05T16:04:03.867] DBD_JOB_START: cluster not registered
The cluster name in the slurm.conf file is SLURM_CLUSTER:
# grep Cluster /etc/slurm/slurm.conf
ClusterName=SLURM_CLUSTER
But this does not match the cluster name shown in the output of "sacctmgr list cluster", which is the cluster name from the Webui cluster settings.
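The mismatch could be checked with something like the sketch below. The helper names conf_cluster_name and cluster_is_registered are hypothetical, not part of Slurm. The comparison ignores case, which is an assumption based on the lowercase table name in the slurmdbd errors above.

```shell
# Hypothetical helper: print the ClusterName value from a slurm.conf-style file.
conf_cluster_name() {
    sed -n 's/^[[:space:]]*ClusterName[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$1" | head -n 1
}

# Hypothetical helper: succeed if the name in $1 matches one of the names
# read from stdin (one per line). Case is ignored; that is an assumption.
cluster_is_registered() {
    want=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    tr '[:upper:]' '[:lower:]' | grep -qxF -- "$want"
}

# Possible use on the controller (not run here):
#   name=$(conf_cluster_name /etc/slurm/slurm.conf)
#   sacctmgr -n -P list cluster format=cluster | cluster_is_registered "$name" \
#       || echo "cluster '$name' is not registered in slurmdbd"
```

If the name from slurm.conf is missing from the sacctmgr list, that would be consistent with the "cluster not registered" messages in the log.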
Is it possible to add an entry for 'slurm_cluster' with the 'sacctmgr add cluster' command? Would this correct the problem, and can it be done without affecting running SLURM jobs?
Ronan