[slurm-users] systemctl enable slurmd.service Failed to execute operation: No such file or directory
Nousheen
nousheenparvaiz at gmail.com
Tue Feb 1 04:06:13 UTC 2022
Dear Ole,
Thank you for your response.
I am going through the installation again using the link you suggested.
Best Regards,
Nousheen Parvaiz
On Mon, Jan 31, 2022 at 2:07 PM Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk>
wrote:
> Hi Nousheen,
>
> I again recommend that you follow the steps for installing Slurm on a
> CentOS 7 cluster:
> https://wiki.fysik.dtu.dk/niflheim/Slurm_installation
>
> Maybe you will need to start the installation from scratch, but the steps are
> guaranteed to work if followed correctly.
>
> IHTH,
> Ole
>
> On 1/31/22 06:23, Nousheen wrote:
> > The same error shows up on the compute node, as follows:
> >
> > [root at c103008 ~]# systemctl enable slurmd.service
> > [root at c103008 ~]# systemctl start slurmd.service
> > [root at c103008 ~]# systemctl status slurmd.service
> > ● slurmd.service - Slurm node daemon
> >    Loaded: loaded (/etc/systemd/system/slurmd.service; enabled; vendor preset: disabled)
> >    Active: failed (Result: exit-code) since Mon 2022-01-31 00:22:42 EST; 2s ago
> >   Process: 11505 ExecStart=/usr/local/sbin/slurmd -D -s $SLURMD_OPTIONS (code=exited, status=203/EXEC)
> >  Main PID: 11505 (code=exited, status=203/EXEC)
> >
> > Jan 31 00:22:42 c103008 systemd[1]: Started Slurm node daemon.
> > Jan 31 00:22:42 c103008 systemd[1]: slurmd.service: main process exited, code=exited, status=203/EXEC
> > Jan 31 00:22:42 c103008 systemd[1]: Unit slurmd.service entered failed state.
> > Jan 31 00:22:42 c103008 systemd[1]: slurmd.service failed.
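> >
> > (For reference: status=203/EXEC from systemd means the ExecStart binary
> > could not be executed at all. A quick check on the compute node, assuming
> > the /usr/local prefix shown in the unit file above:)
> >
> > # is the slurmd binary present and executable where the unit file expects it?
> > ls -l /usr/local/sbin/slurmd
> > # confirm the path the unit file actually uses
> > grep ExecStart /etc/systemd/system/slurmd.service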
> >
> >
> > Best Regards,
> > Nousheen Parvaiz
> >
> >
> >
> > On Mon, Jan 31, 2022 at 10:08 AM Nousheen <nousheenparvaiz at gmail.com> wrote:
> >
> > Dear Jeffrey,
> >
> > Thank you for your response. I have followed the steps as instructed.
> > After copying the files to their respective locations, the "systemctl
> > status slurmctld.service" command gives me the following error:
> >
> > (base) [nousheen at exxact system]$ systemctl daemon-reload
> > (base) [nousheen at exxact system]$ systemctl enable slurmctld.service
> > (base) [nousheen at exxact system]$ systemctl start slurmctld.service
> > (base) [nousheen at exxact system]$ systemctl status slurmctld.service
> > ● slurmctld.service - Slurm controller daemon
> >    Loaded: loaded (/etc/systemd/system/slurmctld.service; enabled; vendor preset: disabled)
> >    Active: failed (Result: exit-code) since Mon 2022-01-31 10:04:31 PKT; 3s ago
> >   Process: 18114 ExecStart=/usr/local/sbin/slurmctld -D -s $SLURMCTLD_OPTIONS (code=exited, status=1/FAILURE)
> >  Main PID: 18114 (code=exited, status=1/FAILURE)
> >
> > Jan 31 10:04:31 exxact systemd[1]: Started Slurm controller daemon.
> > Jan 31 10:04:31 exxact systemd[1]: slurmctld.service: main process exited, code=exited, status=1/FAILURE
> > Jan 31 10:04:31 exxact systemd[1]: Unit slurmctld.service entered failed state.
> > Jan 31 10:04:31 exxact systemd[1]: slurmctld.service failed.
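> >
> > (status=1/FAILURE means slurmctld started and then exited with an error.
> > The actual reason should appear in the controller log configured in
> > slurm.conf, /var/log/slurmctld.log here, or when running the daemon in the
> > foreground; a sketch, assuming those paths:)
> >
> > # show the end of the controller log
> > tail -n 50 /var/log/slurmctld.log
> > # or run slurmctld in the foreground with extra verbosity
> > /usr/local/sbin/slurmctld -D -vvv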
> >
> > Kindly guide me. Thank you so much for your time.
> >
> > Best Regards,
> > Nousheen Parvaiz
> >
> >
> > On Thu, Jan 27, 2022 at 8:25 PM Jeffrey R. Lang <JRLang at uwyo.edu> wrote:
> >
> > The missing file error has nothing to do with Slurm. The systemctl
> > command is part of systemd's service management.
> >
> > The error message indicates that you haven't copied the slurmd.service
> > file on your compute node to /etc/systemd/system or
> > /usr/lib/systemd/system. /etc/systemd/system is usually used when a
> > user adds a new service to a machine.
> >
> > Depending on your version of Linux, you may also need to do a
> > systemctl daemon-reload to activate slurmd.service within systemd.
> >
> > Once slurmd.service is copied over, the systemctl command should
> > work just fine.
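> >
> > A minimal sketch of those steps on the compute node (the exact source
> > path is an assumption; use wherever your build generated slurmd.service,
> > e.g. the etc/ directory of the slurm-21.08.5 source tree):
> >
> > # copy the unit file produced by the Slurm build into systemd's directory
> > cp /path/to/slurm-21.08.5/etc/slurmd.service /etc/systemd/system/
> > # make systemd re-read its unit files, then enable and start slurmd
> > systemctl daemon-reload
> > systemctl enable slurmd.service
> > systemctl start slurmd.service
> > systemctl status slurmd.service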
> >
> > Remember:
> >
> > slurmd.service    - Only on compute nodes
> > slurmctld.service - Only on your cluster management node
> > slurmdbd.service  - Only on your cluster management node
> >
> > *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> *On Behalf Of* Nousheen
> > *Sent:* Thursday, January 27, 2022 3:54 AM
> > *To:* Slurm User Community List <slurm-users at lists.schedmd.com>
> > *Subject:* [slurm-users] systemctl enable slurmd.service Failed to execute operation: No such file or directory
> >
> > Hello everyone,
> >
> > I am installing Slurm on CentOS 7 following this tutorial:
> > https://www.slothparadise.com/how-to-install-slurm-on-centos-7-cluster/
> >
> > I am at the step where we start Slurm, but it gives me the
> > following error:
> >
> > [root at exxact slurm-21.08.5]# systemctl enable slurmd.service
> > Failed to execute operation: No such file or directory
> >
> > I have run the following command to check whether Slurm is configured
> > properly:
> >
> > [root at exxact slurm-21.08.5]# slurmd -C
> > NodeName=exxact CPUs=12 Boards=1 SocketsPerBoard=1 CoresPerSocket=6 ThreadsPerCore=2 RealMemory=31889
> > UpTime=19-16:06:00
> >
> > I am new to this and unable to understand the problem. Kindly help
> > me resolve this.
> >
> > My slurm.conf file is as follows:
> >
> > # slurm.conf file generated by configurator.html.
> > # Put this file on all nodes of your cluster.
> > # See the slurm.conf man page for more information.
> > #
> > ClusterName=cluster194
> > SlurmctldHost=192.168.60.194
> > #SlurmctldHost=
> > #
> > #DisableRootJobs=NO
> > #EnforcePartLimits=NO
> > #Epilog=
> > #EpilogSlurmctld=
> > #FirstJobId=1
> > #MaxJobId=67043328
> > #GresTypes=
> > #GroupUpdateForce=0
> > #GroupUpdateTime=600
> > #JobFileAppend=0
> > #JobRequeue=1
> > #JobSubmitPlugins=lua
> > #KillOnBadExit=0
> > #LaunchType=launch/slurm
> > #Licenses=foo*4,bar
> > #MailProg=/bin/mail
> > #MaxJobCount=10000
> > #MaxStepCount=40000
> > #MaxTasksPerNode=512
> > MpiDefault=none
> > #MpiParams=ports=#-#
> > #PluginDir=
> > #PlugStackConfig=
> > #PrivateData=jobs
> > ProctrackType=proctrack/cgroup
> > #Prolog=
> > #PrologFlags=
> > #PrologSlurmctld=
> > #PropagatePrioProcess=0
> > #PropagateResourceLimits=
> > #PropagateResourceLimitsExcept=
> > #RebootProgram=
> > ReturnToService=1
> > SlurmctldPidFile=/var/run/slurmctld.pid
> > SlurmctldPort=6817
> > SlurmdPidFile=/var/run/slurmd.pid
> > SlurmdPort=6818
> > SlurmdSpoolDir=/var/spool/slurmd
> > SlurmUser=nousheen
> > #SlurmdUser=root
> > #SrunEpilog=
> > #SrunProlog=
> > StateSaveLocation=/home/nousheen/Documents/SILICS/slurm-21.08.5/slurmctld
> > SwitchType=switch/none
> > #TaskEpilog=
> > TaskPlugin=task/affinity
> > #TaskProlog=
> > #TopologyPlugin=topology/tree
> > #TmpFS=/tmp
> > #TrackWCKey=no
> > #TreeWidth=
> > #UnkillableStepProgram=
> > #UsePAM=0
> > #
> > #
> > # TIMERS
> > #BatchStartTimeout=10
> > #CompleteWait=0
> > #EpilogMsgTime=2000
> > #GetEnvTimeout=2
> > #HealthCheckInterval=0
> > #HealthCheckProgram=
> > InactiveLimit=0
> > KillWait=30
> > #MessageTimeout=10
> > #ResvOverRun=0
> > MinJobAge=300
> > #OverTimeLimit=0
> > SlurmctldTimeout=120
> > SlurmdTimeout=300
> > #UnkillableStepTimeout=60
> > #VSizeFactor=0
> > Waittime=0
> > #
> > #
> > # SCHEDULING
> > #DefMemPerCPU=0
> > #MaxMemPerCPU=0
> > #SchedulerTimeSlice=30
> > SchedulerType=sched/backfill
> > SelectType=select/cons_tres
> > SelectTypeParameters=CR_Core
> > #
> > #
> > # JOB PRIORITY
> > #PriorityFlags=
> > #PriorityType=priority/basic
> > #PriorityDecayHalfLife=
> > #PriorityCalcPeriod=
> > #PriorityFavorSmall=
> > #PriorityMaxAge=
> > #PriorityUsageResetPeriod=
> > #PriorityWeightAge=
> > #PriorityWeightFairshare=
> > #PriorityWeightJobSize=
> > #PriorityWeightPartition=
> > #PriorityWeightQOS=
> > #
> > #
> > # LOGGING AND ACCOUNTING
> > #AccountingStorageEnforce=0
> > #AccountingStorageHost=
> > #AccountingStoragePass=
> > #AccountingStoragePort=
> > AccountingStorageType=accounting_storage/none
> > #AccountingStorageUser=
> > #AccountingStoreFlags=
> > #JobCompHost=
> > #JobCompLoc=
> > #JobCompPass=
> > #JobCompPort=
> > JobCompType=jobcomp/none
> > #JobCompUser=
> > #JobContainerType=job_container/none
> > JobAcctGatherFrequency=30
> > JobAcctGatherType=jobacct_gather/none
> > SlurmctldDebug=info
> > SlurmctldLogFile=/var/log/slurmctld.log
> > SlurmdDebug=info
> > SlurmdLogFile=/var/log/slurmd.log
> > #SlurmSchedLogFile=
> > #SlurmSchedLogLevel=
> > #DebugFlags=
> > #
> > #
> > # POWER SAVE SUPPORT FOR IDLE NODES (optional)
> > #SuspendProgram=
> > #ResumeProgram=
> > #SuspendTimeout=
> > #ResumeTimeout=
> > #ResumeRate=
> > #SuspendExcNodes=
> > #SuspendExcParts=
> > #SuspendRate=
> > #SuspendTime=
> > #
> > #
> > # COMPUTE NODES
> > NodeName=linux[1-32] CPUs=11 State=UNKNOWN
> >
> > PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP
> >
> > Best Regards,
> > Nousheen Parvaiz
>
>