[slurm-users] After Each slurm Run, I Need to Reinstall slurm
linux-ken at comcast.net
Sat May 5 12:44:58 MDT 2018
I am a new slurm user trying to set up a single-node test system, and I have spent many hours trying to get the slurm services to start. I am running Ubuntu Server 16.04 and slurm 17.11.5 on a motherboard with an 8-core AMD processor. When I try to start the slurmdbd or slurmctld services, I get messages saying that shared libraries cannot be accessed or that pid files are missing. At times I have noticed that the pid files in /var/run have been deleted, so I keep copies of them and copy them back to /var/run when they go missing.
I have found that if I reinstall slurm from the tarball, the services will start. To speed things up, I have written a bash script that reinstalls slurm, starting from the tarball extraction step. This is a very inefficient workaround.
Can anyone help me work out why slurm runs only once and then fails on subsequent starts?
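The two symptoms described above, unresolved shared libraries and missing pid files, can be checked directly before a start attempt. The following is only a minimal diagnostic sketch. It assumes the default tarball install prefix (/usr/local/sbin) and the default pid file locations under /var/run, neither of which is confirmed for this system; the helper names missing_libs and missing_files are hypothetical.

```shell
#!/usr/bin/env bash
# Diagnostic sketch: report unresolved shared libraries for the slurm
# daemons and list any pid files that are missing.
# ASSUMPTION: paths below are defaults and may not match this installation.

# Print every "not found" line that ldd reports for the given binary.
missing_libs() {
    ldd "$1" 2>/dev/null | grep 'not found'
}

# Print the name of each argument that does not exist on disk.
missing_files() {
    local f
    for f in "$@"; do
        [ -e "$f" ] || printf '%s\n' "$f"
    done
}

# Assumed locations; override via the environment to match slurm.conf
# and slurmdbd.conf on the machine being checked.
SLURM_SBIN="${SLURM_SBIN:-/usr/local/sbin}"
PID_DIR="${PID_DIR:-/var/run}"

for daemon in slurmctld slurmd slurmdbd; do
    bin="$SLURM_SBIN/$daemon"
    if [ -x "$bin" ]; then
        echo "== $daemon: unresolved libraries =="
        missing_libs "$bin" || echo "none"
    fi
done

echo "== missing pid files =="
missing_files "$PID_DIR/slurmctld.pid" "$PID_DIR/slurmd.pid" "$PID_DIR/slurmdbd.pid"
```

Running this once right after a successful reinstall and again after a failed start would show which of the two symptoms actually changes between runs.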
I can send copies of conf and log files if requested.
Thanks in advance.