[slurm-users] Slurm: Insane message length

BADREDDINE Alaa Alaa.BADREDDINE@cnrs.fr
Tue Sep 17 09:57:44 UTC 2019


I have the latest version of Slurm (`slurm 20.02.0-0pre1`) installed on four small Debian test servers:

`test1` = `slurm master`

`test2` = `slurm backup`

`test3` = `slurmdbd`

`test4` = `slurm slave`

I am trying to get it working and have set up all my configuration files. The logs don't seem to mention anything special except on `test3` and `test4`, and I am not sure what is causing the errors there. Anyway, whenever I run `srun`, `squeue`, `sinfo`, or any other Slurm command, I get one of the following messages: `slurm_load_partitions: Insane message length` or `slurm_load_jobs error: Insane message length`. I don't know what this means and I can't find anything related to this issue. Here are my configurations:

[slurm.conf]

```
# slurm.conf file generated by configurator.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
SlurmctldHost=test1
SlurmctldHost=test2
#
ProctrackType=proctrack/cgroup
ReturnToService=2
SlurmctldPidFile=/media/Nextflow/slurm-llnl2/slurmctld.pid
SlurmctldPort=22
SlurmdPidFile=/var/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
StateSaveLocation=/media/Nextflow/slurm-llnl2
SwitchType=switch/none
TaskPlugin=cgroup,affinity
#
# TIMERS
InactiveLimit=0
KillWait=30
MinJobAge=300
SlurmctldTimeout=120
SlurmdTimeout=300
Waittime=0
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory
#
# LOGGING AND ACCOUNTING
AccountingStorageHost=test3
AccountingStorageLoc=slurm_acct_db
AccountingStoragePort=22
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageUser=slurm
AccountingStoreJobComment=YES
ClusterName=cluster
JobCompHost=test3
JobCompPass=xxx
JobCompPort=3306
JobCompType=jobcomp/slurmdbd
JobCompUser=root
JobAcctGatherFrequency=60
JobAcctGatherType=jobacct_gather/linux
SlurmctldDebug=debug5
SlurmctldLogFile=/media/Nextflow/slurm-llnl2/slurmctld.log
SlurmdDebug=debug5
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
#
# COMPUTE NODES
NodeName=test4 CPUs=4 RealMemory=95000 Sockets=2 CoresPerSocket=2 ThreadsPerCore=1 State=UNKNOWN
PartitionName=cluster Nodes=test4 Default=YES MaxTime=INFINITE State=UP
```
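
(Rereading the file, I notice that both `SlurmctldPort` and `AccountingStoragePort` point at 22, which is normally sshd's port; I don't know if that is related. A minimal probe I sketched to check what actually answers on that port, assuming it is run from one of the test nodes:)

```python
import socket

# Connect to the port configured as SlurmctldPort and print whatever the
# listener sends first. sshd announces itself with a banner immediately on
# connect, so a single recv() distinguishes it from slurmctld, which sends
# nothing unsolicited.
with socket.create_connection(("test1", 22), timeout=5) as s:
    print(s.recv(64))
```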

[slurmdbd.conf]

```
#
# Sample /etc/slurmdbd.conf
#
ArchiveEvents=yes
ArchiveJobs=yes
ArchiveResvs=yes
ArchiveSteps=no
ArchiveSuspend=no
ArchiveTXN=no
ArchiveUsage=no
AuthInfo=/var/run/munge/munge.socket.2
AuthType=auth/munge
DbdHost=test3
DebugLevel=debug5
PurgeEventAfter=1month
PurgeJobAfter=12month
PurgeResvAfter=1month
PurgeStepAfter=1month
PurgeSuspendAfter=1month
PurgeTXNAfter=12month
PurgeUsageAfter=24month
LogFile=/var/log/slurmdbd.log
PidFile=/var/run/slurmdbd.pid
SlurmUser=slurm
StoragePass=xxx
StorageType=accounting_storage/mysql
StorageUser=slurm
```

[cgroup.conf]

```
###
# Slurm cgroup support configuration file
###
CgroupAutomount=yes
ConstrainCores=yes
TaskAffinity=no
#
```

As for the logs, this is what I get:

[slurmctld.log]

```
root@test1:~# cat /media/Nextflow/slurm-llnl2/slurmctld.log
[2019-09-16T18:47:20.398] debug:  Log file re-opened
[2019-09-16T18:47:20.414] pidfile not locked, assuming no running daemon
[2019-09-16T18:47:20.420] debug:  sched: slurmctld starting
[2019-09-16T18:47:20.428] slurmctld version 20.02.0-0pre1 started on cluster cluster
[2019-09-16T18:47:20.430] debug3: Trying to load plugin /usr/lib/slurm/cred_munge.so
[2019-09-16T18:47:20.432] Munge credential signature plugin loaded
[2019-09-16T18:47:20.433] debug3: Success.
[2019-09-16T18:47:20.435] debug3: Trying to load plugin /usr/lib/slurm/auth_munge.so
[2019-09-16T18:47:20.436] debug:  Munge authentication plugin loaded
[2019-09-16T18:47:20.438] debug3: Success.
[2019-09-16T18:47:20.439] debug3: Trying to load plugin /usr/lib/slurm/select_cons_tres.so
[2019-09-16T18:47:20.442] select/cons_tres loaded with argument 17
[2019-09-16T18:47:20.443] debug3: Success.
[2019-09-16T18:47:20.444] debug3: Trying to load plugin /usr/lib/slurm/select_linear.so
[2019-09-16T18:47:20.446] Linear node selection plugin loaded with argument 17
[2019-09-16T18:47:20.447] debug3: Success.
[2019-09-16T18:47:20.448] debug3: Trying to load plugin /usr/lib/slurm/select_cons_res.so
[2019-09-16T18:47:20.449] select/cons_res loaded with argument 17
[2019-09-16T18:47:20.451] debug3: Success.
[2019-09-16T18:47:20.452] debug3: Trying to load plugin /usr/lib/slurm/select_cray_aries.so
[2019-09-16T18:47:20.453] Cray/Aries node selection plugin loaded
[2019-09-16T18:47:20.454] debug3: Success.
[2019-09-16T18:47:20.456] debug3: Trying to load plugin /usr/lib/slurm/preempt_none.so
[2019-09-16T18:47:20.457] preempt/none loaded
[2019-09-16T18:47:20.458] debug3: Success.
[2019-09-16T18:47:20.460] debug3: Trying to load plugin /usr/lib/slurm/checkpoint_none.so
[2019-09-16T18:47:20.461] debug3: Success.
[2019-09-16T18:47:20.462] debug:  Checkpoint plugin loaded: checkpoint/none
[2019-09-16T18:47:20.463] debug3: Trying to load plugin /usr/lib/slurm/acct_gather_energy_none.so
[2019-09-16T18:47:20.465] debug:  AcctGatherEnergy NONE plugin loaded
[2019-09-16T18:47:20.466] debug3: Success.
[2019-09-16T18:47:20.467] debug3: Trying to load plugin /usr/lib/slurm/acct_gather_profile_none.so
[2019-09-16T18:47:20.469] debug:  AcctGatherProfile NONE plugin loaded
[2019-09-16T18:47:20.470] debug3: Success.
[2019-09-16T18:47:20.471] debug3: Trying to load plugin /usr/lib/slurm/acct_gather_interconnect_none.so
[2019-09-16T18:47:20.473] debug:  AcctGatherInterconnect NONE plugin loaded
[2019-09-16T18:47:20.474] debug3: Success.
[2019-09-16T18:47:20.475] debug3: Trying to load plugin /usr/lib/slurm/acct_gather_filesystem_none.so
[2019-09-16T18:47:20.476] debug:  AcctGatherFilesystem NONE plugin loaded
[2019-09-16T18:47:20.477] debug3: Success.
[2019-09-16T18:47:20.479] debug2: Reading acct_gather.conf file /etc/slurm-llnl/acct_gather.conf
[2019-09-16T18:47:20.480] s_p_parse_file: file "/etc/slurm-llnl/acct_gather.conf" is empty
[2019-09-16T18:47:20.482] debug3: Trying to load plugin /usr/lib/slurm/jobacct_gather_linux.so
[2019-09-16T18:47:20.483] debug:  Job accounting gather LINUX plugin loaded
[2019-09-16T18:47:20.485] debug3: Success.
[2019-09-16T18:47:20.486] debug3: Trying to load plugin /usr/lib/slurm/ext_sensors_none.so
[2019-09-16T18:47:20.487] ExtSensors NONE plugin loaded
[2019-09-16T18:47:20.489] debug3: Success.
[2019-09-16T18:47:20.490] debug3: Trying to load plugin /usr/lib/slurm/switch_none.so
[2019-09-16T18:47:20.491] debug:  switch NONE plugin loaded
[2019-09-16T18:47:20.493] debug3: Success.
[2019-09-16T18:47:20.494] debug:  power_save module disabled, SuspendTime < 0
[2019-09-16T18:47:20.498] debug:  Requesting control from backup controller test2
[2019-09-16T18:47:20.533] error: _shutdown_bu_thread:send/recv test2: Insane message length
[2019-09-16T18:47:20.535] debug3: Trying to load plugin /usr/lib/slurm/accounting_storage_slurmdbd.so
[2019-09-16T18:47:20.537] Accounting storage SLURMDBD plugin loaded
[2019-09-16T18:47:20.538] debug3: Success.
[2019-09-16T18:47:24.610] error: Persistent Conn: only read 35 of 1397966893 bytes
[2019-09-16T18:47:30.679] error: Persistent Conn: only read 35 of 1397966893 bytes
[2019-09-16T18:47:31.749] error: Persistent Conn: only read 35 of 1397966893 bytes
[2019-09-16T18:47:36.058] error: Persistent Conn: only read 35 of 1397966893 bytes
[2019-09-16T18:47:39.505] error: Persistent Conn: only read 35 of 1397966893 bytes
```
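
That repeated byte count struck me as odd, so I decoded it out of curiosity. Interpreted as a big-endian 32-bit length prefix, 1397966893 is exactly the four ASCII bytes `SSH-` (a quick check, nothing Slurm-specific):

```python
import struct

# 1397966893 is the "message length" slurmctld keeps reporting above.
# Packed as a big-endian unsigned 32-bit int, the bytes are printable ASCII.
print(struct.pack(">I", 1397966893))  # b'SSH-'
```

which looks more like the start of an SSH banner than a Slurm message header.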

[slurmdbd.log]

```
root@test3:~# cat /var/log/slurmdbd.log
[2019-09-16T18:46:56.514] debug:  Log file re-opened
[2019-09-16T18:46:56.514] pidfile not locked, assuming no running daemon
[2019-09-16T18:46:56.517] debug3: Trying to load plugin /usr/lib/slurm/auth_munge.so
[2019-09-16T18:46:56.518] debug:  Munge authentication plugin loaded
[2019-09-16T18:46:56.518] debug3: Success.
[2019-09-16T18:46:56.518] debug3: Trying to load plugin /usr/lib/slurm/accounting_storage_mysql.so
[2019-09-16T18:46:56.521] debug2: mysql_connect() called for db slurm_acct_db
[2019-09-16T18:46:56.523] debug2: Attempting to connect to localhost:3306
[2019-09-16T18:46:56.529] debug2: innodb_buffer_pool_size: 75161927680
[2019-09-16T18:46:56.531] debug2: innodb_log_file_size: 50331648
[2019-09-16T18:46:56.532] debug2: innodb_lock_wait_timeout: 600
[2019-09-16T18:46:56.556] debug4: 0(as_mysql_convert.c:560) query
select version from convert_version_table
[2019-09-16T18:46:56.556] debug4: as_mysql_convert_tables_pre_create: No conversion needed, Horray!
[2019-09-16T18:46:56.618] debug4: as_mysql_convert_non_cluster_tables_post_create: No conversion needed, Horray!
[2019-09-16T18:46:56.637] Accounting storage MYSQL plugin loaded
[2019-09-16T18:46:56.638] debug3: Success.
[2019-09-16T18:46:56.638] debug2: ArchiveDir        = /tmp
[2019-09-16T18:46:56.638] debug2: ArchiveScript     = (null)
[2019-09-16T18:46:56.638] debug2: AuthAltTypes      = (null)
[2019-09-16T18:46:56.638] debug2: AuthInfo          = /var/run/munge/munge.socket.2
[2019-09-16T18:46:56.638] debug2: AuthType          = auth/munge
[2019-09-16T18:46:56.638] debug2: CommitDelay       = 0
[2019-09-16T18:46:56.638] debug2: DbdAddr           = test3
[2019-09-16T18:46:56.638] debug2: DbdBackupHost     = (null)
[2019-09-16T18:46:56.638] debug2: DbdHost           = test3
[2019-09-16T18:46:56.639] debug2: DbdPort           = 6819
[2019-09-16T18:46:56.639] debug2: DebugFlags        = (null)
[2019-09-16T18:46:56.639] debug2: DebugLevel        = 9
[2019-09-16T18:46:56.639] debug2: DebugLevelSyslog  = 10
[2019-09-16T18:46:56.639] debug2: DefaultQOS        = (null)
[2019-09-16T18:46:56.639] debug2: LogFile           = /var/log/slurmdbd.log
[2019-09-16T18:46:56.639] debug2: MessageTimeout    = 10
[2019-09-16T18:46:56.639] debug2: Parameters        = (null)
[2019-09-16T18:46:56.639] debug2: PidFile           = /var/run/slurmdbd.pid
[2019-09-16T18:46:56.639] debug2: PluginDir         = /usr/lib/slurm
[2019-09-16T18:46:56.639] debug2: PrivateData       = none
[2019-09-16T18:46:56.639] debug2: PurgeEventAfter   = 1 months*
[2019-09-16T18:46:56.639] debug2: PurgeJobAfter     = 12 months*
[2019-09-16T18:46:56.639] debug2: PurgeResvAfter    = 1 months*
[2019-09-16T18:46:56.639] debug2: PurgeStepAfter    = 1 months
[2019-09-16T18:46:56.639] debug2: PurgeSuspendAfter = 1 months
[2019-09-16T18:46:56.639] debug2: PurgeTXNAfter = 12 months
[2019-09-16T18:46:56.639] debug2: PurgeUsageAfter = 24 months
[2019-09-16T18:46:56.639] debug2: SlurmUser         = slurm(982)
[2019-09-16T18:46:56.639] debug2: StorageBackupHost = (null)
[2019-09-16T18:46:56.639] debug2: StorageHost       = localhost
[2019-09-16T18:46:56.639] debug2: StorageLoc        = slurm_acct_db
[2019-09-16T18:46:56.639] debug2: StoragePort       = 3306
[2019-09-16T18:46:56.639] debug2: StorageType       = accounting_storage/mysql
[2019-09-16T18:46:56.639] debug2: StorageUser       = slurm
[2019-09-16T18:46:56.639] debug2: TCPTimeout        = 2
[2019-09-16T18:46:56.639] debug2: TrackWCKey        = 0
[2019-09-16T18:46:56.640] debug2: TrackSlurmctldDown= 0
[2019-09-16T18:46:56.640] debug2: acct_storage_p_get_connection: request new connection 1
[2019-09-16T18:46:56.640] debug2: Attempting to connect to localhost:3306
[2019-09-16T18:46:56.645] slurmdbd version 20.02.0-0pre1 started
[2019-09-16T18:46:56.646] debug2: running rollup at Mon Sep 16 18:46:56 2019
[2019-09-16T18:46:56.646] error: Error binding slurm stream socket: Address already in use
[2019-09-16T18:46:56.646] fatal: slurm_init_msg_engine_port error Address already in use
```
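
For the fatal `Address already in use` at the end, here is a throwaway check I could run on test3 to see whether something is already bound to the DbdPort (6819, per the debug2 output above) before the daemon starts:

```python
import socket

# Try to bind the slurmdbd port; if another process (e.g. an old slurmdbd
# instance) still holds it, the OS raises EADDRINUSE here as well.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s.bind(("0.0.0.0", 6819))
    print("port 6819 is free")
except OSError as e:
    print("port 6819 already in use:", e)
finally:
    s.close()
```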

[slurmd.log]

```
root@test4:~# cat /var/log/slurm-llnl/slurmd.log
[2019-09-16T18:42:03.692] debug3: CPUs=4 Boards=1 Sockets=2 Cores=2 Threads=1 Memory=98357 TmpDisk=34058 Uptime=32212 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)
[2019-09-16T18:42:03.716] error: Unable to register: Insane message length
[2019-09-16T18:42:03.716] debug:  Unable to register with slurm controller, retrying
[2019-09-16T18:42:04.716] debug3: CPUs=4 Boards=1 Sockets=2 Cores=2 Threads=1 Memory=98357 TmpDisk=34058 Uptime=32213 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)
[2019-09-16T18:42:04.740] error: Unable to register: Insane message length
[2019-09-16T18:42:04.740] debug:  Unable to register with slurm controller, retrying
[2019-09-16T18:42:05.229] debug:  Log file re-opened
[2019-09-16T18:42:05.229] debug2: hwloc_topology_init
[2019-09-16T18:42:05.243] debug2: hwloc_topology_load
[2019-09-16T18:42:05.256] debug2: hwloc_topology_export_xml
[2019-09-16T18:42:05.258] debug:  CPUs:4 Boards:1 Sockets:2 CoresPerSocket:2 ThreadsPerCore:1
[2019-09-16T18:42:05.258] debug4: CPU map[0]=>0 S:C:T 0:0:0
[2019-09-16T18:42:05.258] debug4: CPU map[1]=>1 S:C:T 0:1:0
[2019-09-16T18:42:05.258] debug4: CPU map[2]=>2 S:C:T 1:0:0
[2019-09-16T18:42:05.258] debug4: CPU map[3]=>3 S:C:T 1:1:0
[2019-09-16T18:42:05.259] Message aggregation disabled
[2019-09-16T18:42:05.259] debug:  Reading cgroup.conf file /etc/slurm-llnl/cgroup.conf
[2019-09-16T18:42:05.259] debug3: initializing slurmd spool directory
[2019-09-16T18:42:05.259] debug2: hwloc_topology_init
[2019-09-16T18:42:05.263] debug2: xcpuinfo_hwloc_topo_load: xml file (/var/spool/slurmd/hwloc_topo_whole.xml) found
[2019-09-16T18:42:05.265] debug:  CPUs:4 Boards:1 Sockets:2 CoresPerSocket:2 ThreadsPerCore:1
[2019-09-16T18:42:05.265] debug4: CPU map[0]=>0 S:C:T 0:0:0
[2019-09-16T18:42:05.265] debug4: CPU map[1]=>1 S:C:T 0:1:0
[2019-09-16T18:42:05.265] debug4: CPU map[2]=>2 S:C:T 1:0:0
[2019-09-16T18:42:05.265] debug4: CPU map[3]=>3 S:C:T 1:1:0
[2019-09-16T18:42:05.265] debug3: Trying to load plugin /usr/lib/slurm/topology_none.so
[2019-09-16T18:42:05.266] topology NONE plugin loaded
[2019-09-16T18:42:05.266] debug3: Success.
[2019-09-16T18:42:05.266] debug3: Trying to load plugin /usr/lib/slurm/route_default.so
[2019-09-16T18:42:05.266] route default plugin loaded
[2019-09-16T18:42:05.266] debug3: Success.
[2019-09-16T18:42:05.269] CPU frequency setting not configured for this node
[2019-09-16T18:42:05.269] debug:  Resource spec: No specialized cores configured by default on this node
[2019-09-16T18:42:05.269] debug:  Resource spec: Reserved system memory limit not configured for this node
[2019-09-16T18:42:05.269] debug3: Trying to load plugin /usr/lib/slurm/proctrack_cgroup.so
[2019-09-16T18:42:05.270] debug:  Reading cgroup.conf file /etc/slurm-llnl/cgroup.conf
[2019-09-16T18:42:05.270] debug3: Success.
[2019-09-16T18:42:05.270] debug3: Trying to load plugin /usr/lib/slurm/task_cgroup.so
[2019-09-16T18:42:05.271] debug:  task/cgroup: now constraining jobs allocated cores
[2019-09-16T18:42:05.271] debug:  task/cgroup: loaded
[2019-09-16T18:42:05.271] debug3: Success.
[2019-09-16T18:42:05.271] debug3: Trying to load plugin /usr/lib/slurm/task_affinity.so
[2019-09-16T18:42:05.271] debug3: sched_getaffinity(0) = 0xf
[2019-09-16T18:42:05.271] task affinity plugin loaded with CPU mask 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000f
[2019-09-16T18:42:05.271] debug3: Success.
[2019-09-16T18:42:05.271] debug3: Trying to load plugin /usr/lib/slurm/auth_munge.so
[2019-09-16T18:42:05.272] debug:  Munge authentication plugin loaded
[2019-09-16T18:42:05.272] debug3: Success.
[2019-09-16T18:42:05.272] debug:  spank: opening plugin stack /etc/slurm-llnl/plugstack.conf
[2019-09-16T18:42:05.272] debug3: Trying to load plugin /usr/lib/slurm/cred_munge.so
[2019-09-16T18:42:05.272] Munge credential signature plugin loaded
[2019-09-16T18:42:05.272] debug3: Success.
[2019-09-16T18:42:05.272] debug3: slurmd initialization successful
[2019-09-16T18:42:05.274] slurmd version 20.02.0-0pre1 started
[2019-09-16T18:42:05.275] debug3: finished daemonize
[2019-09-16T18:42:05.275] killing old slurmd[11797]
[2019-09-16T18:42:05.275] got shutdown request
[2019-09-16T18:42:05.275] waiting on 1 active threads
[2019-09-16T18:42:05.276] debug3: Trying to load plugin /usr/lib/slurm/jobacct_gather_linux.so
[2019-09-16T18:42:05.276] debug:  Job accounting gather LINUX plugin loaded
[2019-09-16T18:42:05.276] debug3: Success.
[2019-09-16T18:42:05.276] debug3: Trying to load plugin /usr/lib/slurm/job_container_none.so
[2019-09-16T18:42:05.277] debug:  job_container none plugin loaded
[2019-09-16T18:42:05.277] debug3: Success.
[2019-09-16T18:42:05.277] debug3: Trying to load plugin /usr/lib/slurm/core_spec_none.so
[2019-09-16T18:42:05.277] debug3: Success.
[2019-09-16T18:42:05.277] debug3: Trying to load plugin /usr/lib/slurm/switch_none.so
[2019-09-16T18:42:05.277] debug:  switch NONE plugin loaded
[2019-09-16T18:42:05.277] debug3: Success.
[2019-09-16T18:42:05.278] debug3: successfully opened slurm listen port *:6818
[2019-09-16T18:42:05.278] slurmd started on Mon, 16 Sep 2019 18:42:05 +0200
[2019-09-16T18:42:05.278] CPUs=4 Boards=1 Sockets=2 Cores=2 Threads=1 Memory=98357 TmpDisk=34058 Uptime=32213 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)
[2019-09-16T18:42:05.279] debug3: Trying to load plugin /usr/lib/slurm/acct_gather_energy_none.so
[2019-09-16T18:42:05.279] debug:  AcctGatherEnergy NONE plugin loaded
[2019-09-16T18:42:05.279] debug3: Success.
[2019-09-16T18:42:05.279] debug3: Trying to load plugin /usr/lib/slurm/acct_gather_profile_none.so
[2019-09-16T18:42:05.280] debug:  AcctGatherProfile NONE plugin loaded
[2019-09-16T18:42:05.280] debug3: Success.
[2019-09-16T18:42:05.280] debug3: Trying to load plugin /usr/lib/slurm/acct_gather_interconnect_none.so
[2019-09-16T18:42:05.280] debug:  AcctGatherInterconnect NONE plugin loaded
[2019-09-16T18:42:05.280] debug3: Success.
[2019-09-16T18:42:05.280] debug3: Trying to load plugin /usr/lib/slurm/acct_gather_filesystem_none.so
[2019-09-16T18:42:05.280] debug:  AcctGatherFilesystem NONE plugin loaded
[2019-09-16T18:42:05.280] debug3: Success.
[2019-09-16T18:42:05.280] debug2: Reading acct_gather.conf file /etc/slurm-llnl/acct_gather.conf
[2019-09-16T18:42:05.281] s_p_parse_file: file "/etc/slurm-llnl/acct_gather.conf" is empty
[2019-09-16T18:42:05.304] error: Unable to register: Insane message length
[2019-09-16T18:42:05.304] debug:  Unable to register with slurm controller, retrying
[2019-09-16T18:42:05.740] all threads complete
[2019-09-16T18:42:05.742] debug3: xcgroup_set_uint32_param: parameter 'cgroup.procs' set to '11797' for '/sys/fs/cgroup/cpuset'
[2019-09-16T18:42:05.742] debug2: _file_read_uint32s: unable to open '(null)/tasks' for reading : No such file or directory
[2019-09-16T18:42:05.742] debug2: xcgroup_get_pids: unable to get pids of '(null)'
[2019-09-16T18:42:05.742] debug3: Took 0 checks before stepd pid 11797 was removed from the cpuset step cgroup.
[2019-09-16T18:42:05.742] debug:  task affinity plugin unloaded
[2019-09-16T18:42:05.743] debug3: xcgroup_set_uint32_param: parameter 'cgroup.procs' set to '11797' for '/sys/fs/cgroup/freezer'
[2019-09-16T18:42:05.743] debug2: _file_read_uint32s: unable to open '(null)/tasks' for reading : No such file or directory
[2019-09-16T18:42:05.743] debug2: xcgroup_get_pids: unable to get pids of '(null)'
[2019-09-16T18:42:05.743] debug3: Took 0 checks before stepd pid 11797 was removed from the freezer job cgroup.
[2019-09-16T18:42:05.743] select/cons_res shutting down ...
[2019-09-16T18:42:05.744] Munge credential signature plugin unloaded
[2019-09-16T18:42:05.744] Slurmd shutdown completing
[2019-09-16T18:42:06.305] debug3: CPUs=4 Boards=1 Sockets=2 Cores=2 Threads=1 Memory=98357 TmpDisk=34058 Uptime=32214 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)
[2019-09-16T18:42:06.329] error: Unable to register: Insane message length
[2019-09-16T18:42:06.329] debug:  Unable to register with slurm controller, retrying
root@test4:~# > /var/log/slurm-llnl/slurmd.log
root@test4:~# cat /var/log/slurm-llnl/slurmd.log
[2019-09-16T18:47:04.289] debug3: CPUs=4 Boards=1 Sockets=2 Cores=2 Threads=1 Memory=98357 TmpDisk=34058 Uptime=32512 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)
[2019-09-16T18:47:04.313] error: Unable to register: Insane message length
[2019-09-16T18:47:04.313] debug:  Unable to register with slurm controller, retrying
root@test4:~# cat /var/log/slurm-llnl/slurmd.log
[2019-09-16T18:47:04.289] debug3: CPUs=4 Boards=1 Sockets=2 Cores=2 Threads=1 Memory=98357 TmpDisk=34058 Uptime=32512 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)
[2019-09-16T18:47:04.313] error: Unable to register: Insane message length
[2019-09-16T18:47:04.313] debug:  Unable to register with slurm controller, retrying
[2019-09-16T18:47:05.314] debug3: CPUs=4 Boards=1 Sockets=2 Cores=2 Threads=1 Memory=98357 TmpDisk=34058 Uptime=32513 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)
[2019-09-16T18:47:05.336] error: Unable to register: Insane message length
[2019-09-16T18:47:05.337] debug:  Unable to register with slurm controller, retrying
[2019-09-16T18:47:06.337] debug3: CPUs=4 Boards=1 Sockets=2 Cores=2 Threads=1 Memory=98357 TmpDisk=34058 Uptime=32514 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)
[2019-09-16T18:47:06.361] error: Unable to register: Insane message length
[2019-09-16T18:47:06.362] debug:  Unable to register with slurm controller, retrying
root@test4:~# cat /var/log/slurm-llnl/slurmd.log
[2019-09-16T18:47:04.289] debug3: CPUs=4 Boards=1 Sockets=2 Cores=2 Threads=1 Memory=98357 TmpDisk=34058 Uptime=32512 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)
```

Does anyone have any idea what the issue behind this message is, please?

Thanks in advance.