[slurm-users] When I start slurmctld, there are some errors in log.

John Hearns hearnsj at googlemail.com
Fri Jun 15 03:42:49 MDT 2018


Please do three things for the list:

a) cat /etc/*elease*

b) Give details on how Slurm was installed on the master node and the
compute nodes.

c) How was your slurm.conf file created? Is this file identical on the
master node and the compute nodes?
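
Since the log quoted below reports that the StateSaveLocation directory is world writable, a quick check of that condition may also be worth posting. The following is only a minimal sketch: the helper names is_world_writable and report_state_dir are made up for illustration, and it assumes a POSIX sh with a standard find.

```shell
#!/bin/sh
# Sketch only: the helper names below are illustrative, not part of Slurm.

# Return 0 if directory $1 has the "other" write bit set, which is the
# condition the slurmctld StateSaveLocation warning complains about.
is_world_writable() {
    # -prune keeps find from descending into the directory;
    # -perm -0002 matches only when the world-write bit is set.
    [ -n "$(find "$1" -prune -type d -perm -0002 -print 2>/dev/null)" ]
}

# Print a one-line verdict for a candidate state directory.
report_state_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "$dir: missing"
    elif is_world_writable "$dir"; then
        echo "$dir: world writable"
    else
        echo "$dir: ok"
    fi
}
```

For example, "report_state_dir /var/spool/slurmctld" checks the directory discussed in this thread. That path is an assumption here; the directory slurmctld actually uses is whatever StateSaveLocation is set to in slurm.conf, which "scontrol show config | grep StateSaveLocation" should show on a running controller.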



On 15 June 2018 at 11:26, UGI <ugiwgh at gmail.com> wrote:

> The directory /var/spool/slurmctld/ did not exist, so I created it with
> mkdir and ran "chown slurm:slurm /var/spool/slurmctld".
> But the errors are still there.
>
> 2018-06-15 16:00 GMT+08:00 John Hearns <hearnsj at googlemail.com>:
>
>> And what are the permissions on the directory /var/spool/slurmctld/?
>>
>> On 15 June 2018 at 09:11, UGI <ugiwgh at gmail.com> wrote:
>>
>>> When I start slurmctld, there are some errors in the log, and the job
>>> running information is not stored in MySQL via slurmdbd.
>>>
>>> I set
>>>
>>> AccountingStoragePass=/usr/local/munge-munge-0.5.13/var/run/munge/munge.socket.2
>>>
>>> AccountingStorageType=accounting_storage/slurmdbd
>>>
>>> JobAcctGatherType=jobacct_gather/linux
>>>
>>> in slurm.conf.
>>>
>>>
>>> The following is the log output from slurmctld.
>>>
>>> [2018-06-15T11:05:44.763] Terminate signal (SIGINT or SIGTERM) received
>>>
>>> [2018-06-15T11:05:44.807] Saving all slurm state
>>>
>>> [2018-06-15T11:05:45.101] error: slurmdbd: Sending fini msg: No error
>>>
>>> [2018-06-15T11:05:45.126] layouts: all layouts are now unloaded.
>>>
>>> [2018-06-15T11:06:07.761] slurmctld version 17.11.7 started on cluster myslurm
>>>
>>> [2018-06-15T11:06:07.785] error: slurm_persist_conn_open_without_init: failed to open persistent connection to localhost:6819: Connection refused
>>>
>>> [2018-06-15T11:06:07.785] error: slurmdbd: Sending PersistInit msg: Connection refused
>>>
>>> [2018-06-15T11:06:07.785] error: slurmdbd: Sending PersistInit msg: Connection refused
>>>
>>> [2018-06-15T11:06:07.787] layouts: no layout to initialize
>>>
>>> [2018-06-15T11:06:07.824] error: ################################################
>>>
>>> [2018-06-15T11:06:07.824] error: ###       SEVERE SECURITY VULERABILTY        ###
>>>
>>> [2018-06-15T11:06:07.824] error: ### StateSaveLocation DIRECTORY IS WORLD WRITABLE ###
>>>
>>> [2018-06-15T11:06:07.824] error: ###         CORRECT FILE PERMISSIONS         ###
>>>
>>> [2018-06-15T11:06:07.824] error: ################################################
>>>
>>> [2018-06-15T11:06:07.824] layouts: loading entities/relations information
>>>
>>> [2018-06-15T11:06:07.824] Recovered state of 1 nodes
>>>
>>> [2018-06-15T11:06:07.824] Recovered JobID=12 State=0x3 NodeCnt=0 Assoc=2
>>>
>>> [2018-06-15T11:06:07.825] Recovered information about 1 jobs
>>>
>>> [2018-06-15T11:06:07.825] cons_res: select_p_node_init
>>>
>>> [2018-06-15T11:06:07.825] cons_res: preparing for 1 partitions
>>>
>>> [2018-06-15T11:06:07.825] Recovered state of 0 reservations
>>>
>>> [2018-06-15T11:06:07.825] _preserve_plugins: backup_controller not specified
>>>
>>> [2018-06-15T11:06:07.825] cons_res: select_p_reconfigure
>>>
>>> [2018-06-15T11:06:07.825] cons_res: select_p_node_init
>>>
>>> [2018-06-15T11:06:07.825] cons_res: preparing for 1 partitions
>>>
>>> [2018-06-15T11:06:07.825] Running as primary controller
>>>
>>> [2018-06-15T11:06:07.825] Registering slurmctld at port 6817 with slurmdbd.
>>>
>>> [2018-06-15T11:06:07.825] error: slurmdbd: Sending PersistInit msg: Connection refused
>>>
>>> [2018-06-15T11:06:07.825] error: slurmdbd: Sending PersistInit msg: Connection refused
>>>
>>> [2018-06-15T11:06:07.826] No parameter for mcs plugin, default values set
>>>
>>> [2018-06-15T11:06:07.826] mcs: MCSParameters = (null). ondemand set.
>>>
>>> [2018-06-15T11:06:10.829] SchedulerParameters=default_queue_depth=100,max_rpc_cnt=0,max_sched_time=2,partition_job_depth=0,sched_max_job_start=0,sched_min_interval=2
>>>
>>>
>>
>