[slurm-users] How to fix “slurmd.service: Can't open PID file” error
Noki Lee
noki.lee21 at gmail.com
Tue Jun 18 12:03:59 UTC 2019
Although SLURM works fine for submitting, running, and queueing jobs, I get
the minor error below.
$ sudo systemctl status slurmd
Jun 12 10:20:40 noki-System-Product-Name systemd[1]: slurmd.service: Can't open PID file /var/run/slurm-llnl/slurmd.pid (yet?) after start: No such file or directory
$ sudo systemctl status slurmctld
Jun 12 10:20:40 noki-System-Product-Name systemd[1]: slurmd.service: Can't open PID file /var/run/slurm-llnl/slurmd.pid (yet?) after start: No such file or directory
I followed the installation guide at
ftp://www.microway.com/pub/pub/for-customer/SDSU-Training/Webinar_2_Slurm_II--Ubuntu16.04_and_18.04.pdf
Could this problem come from the ownership of the slurm.conf file?
Here are my slurm.conf and the ownership of the slurm*.pid files:
# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ControlMachine=noki-System-Product-Name
#ControlAddr=
#
#MailProg=/bin/mail
MpiDefault=none
#MpiParams=ports=#-#
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=noki
#SlurmdUser=root
StateSaveLocation=/var/spool/slurm-llnl
SwitchType=switch/none
TaskPlugin=task/none
#
#
# TIMERS
#KillWait=30
#MinJobAge=300
#SlurmctldTimeout=120
#SlurmdTimeout=300
#
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
SelectType=select/linear
#SelectTypeParameters=
#
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
ClusterName=linux
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/none
#SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/SlurmctldLogFile
#SlurmdDebug=3
SlurmdLogFile=/var/log/slurm-llnl/SlurmdLogFile
#
#
# COMPUTE NODES
NodeName=noki-System-Product-Name CPUs=4 RealMemory=6963 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
PartitionName=debug Nodes=noki-System-Product-Name Default=YES MaxTime=INFINITE State=UP
$ ls -l /var/run/slurm-llnl/
total 8
-rw-r--r-- 1 noki root 6 Jun 12 10:20 slurmctld.pid
-rw-r--r-- 1 root root 6 Jun 12 10:20 slurmd.pid
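
One check that might help narrow this down is whether the PID file path in slurm.conf agrees with the PIDFile= setting of the systemd unit. The following is only a minimal sketch: conf_value and compare_pidfiles are hypothetical helper names, and the slurm.conf location /etc/slurm-llnl/slurm.conf is an assumption that does not appear above. Note also that /var/run is usually a symlink to /run, so the two paths may differ in that prefix and still refer to the same file.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a key (e.g. SlurmdPidFile) from a
# slurm.conf-style file, skipping commented-out lines.
# Prints nothing if the key is not set.
conf_value() {
    key=$1
    file=$2
    sed -n "s/^[[:space:]]*${key}=\([^[:space:]#]*\).*/\1/p" "$file" | tail -n 1
}

# Hypothetical helper: report whether two PID file paths are identical strings.
compare_pidfiles() {
    if [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "mismatch: slurm.conf has '$1', systemd unit has '$2'"
    fi
}

# Example usage (the slurm.conf path is an assumption; adjust as needed):
#   conf=$(conf_value SlurmdPidFile /etc/slurm-llnl/slurm.conf)
#   unit=$(systemctl show -p PIDFile slurmd | cut -d= -f2)
#   compare_pidfiles "$conf" "$unit"
```

If the two paths match and the file exists, as in the listing above, the "(yet?)" message may simply mean that systemd looked for the file before the daemon had written it.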