[2018-05-21T01:49:33.608] debug: Log file re-opened
[2018-05-21T01:49:33.609] debug: sched: slurmctld starting
[2018-05-21T01:49:33.609] Warning: Core limit is only 0 KB
[2018-05-21T01:49:33.609] slurmctld version 17.11.6 started on cluster cluster
[2018-05-21T01:49:33.610] Munge cryptographic signature plugin loaded
[2018-05-21T01:49:33.610] Consumable Resources (CR) Node Selection plugin loaded with argument 4
[2018-05-21T01:49:33.610] preempt/none loaded
[2018-05-21T01:49:33.610] debug: Checkpoint plugin loaded: checkpoint/none
[2018-05-21T01:49:33.610] debug: AcctGatherEnergy NONE plugin loaded
[2018-05-21T01:49:33.610] debug: AcctGatherProfile NONE plugin loaded
[2018-05-21T01:49:33.610] debug: AcctGatherInterconnect NONE plugin loaded
[2018-05-21T01:49:33.610] debug: AcctGatherFilesystem NONE plugin loaded
[2018-05-21T01:49:33.610] debug2: No acct_gather.conf file (/usr/local/etc/acct_gather.conf)
[2018-05-21T01:49:33.610] debug: Job accounting gather NOT_INVOKED plugin loaded
[2018-05-21T01:49:33.610] ExtSensors NONE plugin loaded
[2018-05-21T01:49:33.610] debug: switch NONE plugin loaded
[2018-05-21T01:49:33.610] debug: power_save module disabled, SuspendTime < 0
[2018-05-21T01:49:33.611] debug: No backup controller to shutdown
[2018-05-21T01:49:33.611] Accounting storage NOT INVOKED plugin loaded
[2018-05-21T01:49:33.611] debug: Reading slurm.conf file: /usr/local/etc/slurm.conf
[2018-05-21T01:49:33.612] layouts: no layout to initialize
[2018-05-21T01:49:33.612] topology NONE plugin loaded
[2018-05-21T01:49:33.612] debug: No DownNodes
[2018-05-21T01:49:33.613] debug: Log file re-opened
[2018-05-21T01:49:33.613] sched: Backfill scheduler plugin loaded
[2018-05-21T01:49:33.614] route default plugin loaded
[2018-05-21T01:49:33.614] layouts: loading entities/relations information
[2018-05-21T01:49:33.614] debug: layouts: 3/3 nodes in hash table, rc=0
[2018-05-21T01:49:33.614] debug: layouts: loading stage 1
[2018-05-21T01:49:33.614] debug: layouts: loading stage 1.1 (restore state)
[2018-05-21T01:49:33.614] debug: layouts: loading stage 2
[2018-05-21T01:49:33.614] debug: layouts: loading stage 3
[2018-05-21T01:49:33.614] Recovered state of 3 nodes
[2018-05-21T01:49:33.614] Down nodes: triumph[02-03]
[2018-05-21T01:49:33.614] Recovered information about 0 jobs
[2018-05-21T01:49:33.614] cons_res: select_p_node_init
[2018-05-21T01:49:33.615] cons_res: preparing for 1 partitions
[2018-05-21T01:49:33.615] debug2: init_requeue_policy: kill_invalid_depend is set to 0
[2018-05-21T01:49:33.615] debug: Updating partition uid access list
[2018-05-21T01:49:33.615] Recovered state of 0 reservations
[2018-05-21T01:49:33.615] State of 0 triggers recovered
[2018-05-21T01:49:33.615] _preserve_plugins: backup_controller not specified
[2018-05-21T01:49:33.615] cons_res: select_p_reconfigure
[2018-05-21T01:49:33.615] cons_res: select_p_node_init
[2018-05-21T01:49:33.615] cons_res: preparing for 1 partitions
[2018-05-21T01:49:33.615] Running as primary controller
[2018-05-21T01:49:33.615] debug: No BackupController, not launching heartbeat.
[2018-05-21T01:49:33.615] debug: Priority BASIC plugin loaded
[2018-05-21T01:49:33.615] No parameter for mcs plugin, default values set
[2018-05-21T01:49:33.615] mcs: MCSParameters = (null). ondemand set.
[2018-05-21T01:49:33.615] debug: mcs none plugin loaded
[2018-05-21T01:49:33.616] debug: power_save mode not enabled
[2018-05-21T01:49:33.616] debug2: slurmctld listening on 0.0.0.0:6817
[2018-05-21T01:49:36.619] debug: Spawning registration agent for triumph[01-03] 3 hosts
[2018-05-21T01:49:36.619] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
[2018-05-21T01:49:36.619] debug2: got 1 threads to send out
[2018-05-21T01:49:36.619] debug2: Tree head got back 0 looking for 3
[2018-05-21T01:49:36.620] debug: Munge authentication plugin loaded
[2018-05-21T01:49:36.621] debug2: Tree head got back 1
[2018-05-21T01:49:36.621] debug2: Processing RPC: MESSAGE_NODE_REGISTRATION_STATUS from uid=0
[2018-05-21T01:49:36.621] debug: validate_node_specs: node triumph01 registered with 0 jobs
[2018-05-21T01:49:36.621] debug2: _slurm_rpc_node_registration complete for triumph01 usec=90
[2018-05-21T01:49:37.620] SchedulerParameters=default_queue_depth=100,max_rpc_cnt=0,max_sched_time=2,partition_job_depth=0,sched_max_job_start=0,sched_min_interval=2
[2018-05-21T01:49:37.620] debug: sched: Running job scheduler
[2018-05-21T01:49:38.622] debug2: slurm_connect poll timeout: Connection timed out
[2018-05-21T01:49:38.622] debug2: Error connecting slurm stream socket at 192.168.0.22:6818: Connection timed out
[2018-05-21T01:49:38.622] debug2: slurm_connect poll timeout: Connection timed out
[2018-05-21T01:49:38.622] debug2: Error connecting slurm stream socket at 192.168.0.23:6818: Connection timed out
[2018-05-21T01:49:38.622] debug2: Tree head got back 2
[2018-05-21T01:49:38.622] debug2: Tree head got back 2
[2018-05-21T01:49:38.622] debug2: Tree head got back 3
[2018-05-21T01:49:38.622] debug2: Tree head got back 3
[2018-05-21T01:49:38.622] agent/is_node_resp: node:triumph02 RPC:REQUEST_NODE_REGISTRATION_STATUS : Communication connection failure
[2018-05-21T01:49:38.622] agent/is_node_resp: node:triumph03 RPC:REQUEST_NODE_REGISTRATION_STATUS : Communication connection failure
[2018-05-21T01:49:38.895] debug2: node_did_resp triumph01
[2018-05-21T01:49:38.895] debug2: agent maximum delay 2 seconds