[slurm-users] How to fix “slurmd.service: Can't open PID file” error
mercan
ahmet.mercan at uhem.itu.edu.tr
Tue Jun 18 12:15:23 UTC 2019
Hi;
The /var/run/slurm-llnl directory and the slurmctld.pid and slurmd.pid
files in it should be owned by the "slurm" user. Your files are owned by
noki and root.
chown -R slurm:slurm /var/run/slurm-llnl
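To confirm who owns the directory and the PID files before and after
running chown, something like the following may help. It is only a
minimal sketch: check_owner is a hypothetical helper name, and it assumes
GNU coreutils stat (for the -c %U option), as found on Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): report whether each path is
# owned by the expected user.
# Usage: check_owner USER PATH...
check_owner() {
    expected=$1
    shift
    status=0
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "MISSING $path"
            status=1
            continue
        fi
        # stat -c %U prints the owning user's name (GNU coreutils).
        owner=$(stat -c %U "$path")
        if [ "$owner" = "$expected" ]; then
            echo "OK $path ($owner)"
        else
            echo "MISMATCH $path (owner $owner, expected $expected)"
            status=1
        fi
    done
    return $status
}
```

For the paths in this thread it could be called as:
check_owner slurm /var/run/slurm-llnl /var/run/slurm-llnl/slurmctld.pid /var/run/slurm-llnl/slurmd.pid
It returns non-zero if any path is missing or owned by a different user.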
Regards;
Ahmet M.
On 18.06.2019 15:03, Noki Lee wrote:
>
> Although Slurm works fine for submitting, queueing, and running jobs, I
> get the minor error below.
>
> |sudo systemctl status slurmd|
>
> |Jun 12 10:20:40 noki-System-Product-Name systemd[1]: slurmd.service:
> Can't open PID file /var/run/slurm-llnl/slurmd.pid (yet?) after start:
> No such file or directory|
>
> |sudo systemctl status slurmctld|
>
> |Jun 12 10:20:40 noki-System-Product-Name systemd[1]: slurmd.service:
> Can't open PID file /var/run/slurm-llnl/slurmd.pid (yet?) after start:
> No such file or directory|
>
> I followed the installation guide from
>
> ftp://www.microway.com/pub/pub/for-customer/SDSU-Training/Webinar_2_Slurm_II--Ubuntu16.04_and_18.04.pdf
>
> Could this problem come from the ownership of the slurm.conf file?
>
> Here are my slurm.conf and the ownership of the slur*.pid files:
>
> |# slurm.conf file generated by configurator easy.html.
> # Put this file on all nodes of your cluster.
> # See the slurm.conf man page for more information.
> #
> ControlMachine=noki-System-Product-Name
> #ControlAddr=
> #
> #MailProg=/bin/mail
> MpiDefault=none
> #MpiParams=ports=#-#
> ProctrackType=proctrack/pgid
> ReturnToService=1
> SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
> #SlurmctldPort=6817
> SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
> #SlurmdPort=6818
> SlurmdSpoolDir=/var/spool/slurmd
> SlurmUser=noki
> #SlurmdUser=root
> StateSaveLocation=/var/spool/slurm-llnl
> SwitchType=switch/none
> TaskPlugin=task/none
> #
> #
> # TIMERS
> #KillWait=30
> #MinJobAge=300
> #SlurmctldTimeout=120
> #SlurmdTimeout=300
> #
> #
> # SCHEDULING
> FastSchedule=1
> SchedulerType=sched/backfill
> SelectType=select/linear
> #SelectTypeParameters=
> #
> #
> # LOGGING AND ACCOUNTING
> AccountingStorageType=accounting_storage/none
> ClusterName=linux
> #JobAcctGatherFrequency=30
> JobAcctGatherType=jobacct_gather/none
> #SlurmctldDebug=3
> SlurmctldLogFile=/var/log/slurm-llnl/SlurmctldLogFile
> #SlurmdDebug=3
> SlurmdLogFile=/var/log/slurm-llnl/SlurmdLogFile
> #
> #
> # COMPUTE NODES
> NodeName=noki-System-Product-Name CPUs=4 RealMemory=6963 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
> PartitionName=debug Nodes=noki-System-Product-Name Default=YES MaxTime=INFINITE State=UP|
> |$ ls -l /var/run/slurm-llnl/
> total 8
> -rw-r--r-- 1 noki root 6 Jun 12 10:20 slurmctld.pid
> -rw-r--r-- 1 root root 6 Jun 12 10:20 slurmd.pid|
>
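The slurm.conf quoted above names both PID file paths and sets
SlurmUser=noki, so it may be worth checking which user and paths the
configuration actually names before changing ownership. The following is
a minimal sketch for reading single-value settings out of the file.
conf_value is a hypothetical helper name, and the
/etc/slurm-llnl/slurm.conf path in the usage comments is an assumption
about a typical Ubuntu install, not something stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): print the value of one
# slurm.conf parameter, skipping commented-out lines.
# Usage: conf_value KEY FILE
conf_value() {
    key=$1
    file=$2
    # Split each line on "="; a commented line such as "#SlurmdUser=root"
    # has "#SlurmdUser" as its first field and so never matches.
    # Parameter names are compared case-insensitively.
    awk -F= -v k="$key" 'tolower($1) == tolower(k) { print $2; exit }' "$file"
}

# Example calls (the path is an assumption; adjust to your install):
#   conf_value SlurmUser        /etc/slurm-llnl/slurm.conf
#   conf_value SlurmctldPidFile /etc/slurm-llnl/slurm.conf
#   conf_value SlurmdPidFile    /etc/slurm-llnl/slurm.conf
```

This only handles lines that hold a single Key=value pair; lines such as
NodeName=... that carry several pairs would need different parsing.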