[slurm-users] Slurm Daemons not starting

Avery Grieve agrieve at umich.edu
Fri Dec 11 22:47:27 UTC 2020


Hi Forum,

First, cluster info:
Debian Buster (armbian) on arm64 architecture -- really nothing too fancy
going on here.

I've built Slurm from source with PMIx support using the following steps:


   - Configure OpenMPI with Slurm and its internal PMIx -- works as intended
   with mpirun/mpiexec
   - Install the Debian package "libpmix-dev"
   - Configure Slurm, pointing it at the PMIx dev library and at MUNGE


I copied the ".service" files from the build directory into
/etc/systemd/system/ and edited them to point to the right configuration
file and related paths. The only thing I'm not sure about is where the
EnvironmentFile line in that unit file should point.
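
For reference, a minimal sketch of one way the EnvironmentFile could be wired
up. The path /etc/default/slurmctld and the drop-in name override.conf are
assumptions chosen for illustration, not something taken from the setup
described above; the only name that comes from this unit is
SLURMCTLD_OPTIONS, which its ExecStart line expands. The files are staged in
a scratch directory so they can be reviewed before anything under /etc is
touched.

```shell
# Stage the files in a scratch directory first; copy them into /etc once reviewed.
stage=$(mktemp -d)
mkdir -p "$stage/etc/default" "$stage/etc/systemd/system/slurmctld.service.d"

# Hypothetical environment file: defines the variable that ExecStart expands.
cat > "$stage/etc/default/slurmctld" <<'EOF'
SLURMCTLD_OPTIONS=""
EOF

# Hypothetical drop-in: the leading "-" tells systemd to ignore a missing file.
cat > "$stage/etc/systemd/system/slurmctld.service.d/override.conf" <<'EOF'
[Service]
EnvironmentFile=-/etc/default/slurmctld
EOF
```

After the staged files are copied into place, "systemctl daemon-reload" makes
systemd pick up the drop-in.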

Now, here's the question. I can run
# systemctl enable slurmctld
No errors are printed, and systemctl status reports the service as inactive.
I then start the service with
# systemctl start slurmctld
Again no errors are printed, and the controller successfully communicates
with the compute nodes.

After a restart, however, the service does not start automatically, and
systemctl status shows the following error:
slurmctld.service - Slurm controller daemon
Loaded: loaded (/etc/systemd/system/slurmctld.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Fri 2020-12-11 21:32:47 GMT; 8min ago
Process: 455 ExecStart=/usr/local/sbin/slurmctld -D $SLURMCTLD_OPTIONS (code=exited, status=1/FAILURE)
Main PID: 455 (code=exited, status=1/FAILURE)

Dec 11 21:32:47 ApacheHead systemd[1]: Started Slurm controller daemon.
Dec 11 21:32:47 ApacheHead systemd[1]: slurmctld.service: Main process exited, code=exited, status=1/FAILURE
Dec 11 21:32:47 ApacheHead systemd[1]: slurmctld.service: Failed with result 'exit-code'.

Despite these errors, I can still start the service manually with the
systemctl start command. Running the slurmctld binary in /usr/local/sbin
directly also works, with no critical errors.
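
A sketch of how the failed boot-time start could be inspected in more detail
than the status output above. The function name collect_unit_diagnostics is
hypothetical; the journalctl and systemctl options it uses are standard.
Whether slurmctld wrote anything useful to the journal is an assumption,
since it may log to its own file as configured in slurm.conf.

```shell
# Hypothetical helper: gather what systemd knows about a unit's last failed start.
collect_unit_diagnostics() {
    unit="${1:-slurmctld.service}"
    # Messages the unit logged during the current boot.
    journalctl -b -u "$unit" --no-pager
    # The unit file as systemd actually loaded it, including any drop-ins.
    systemctl cat "$unit"
    # Ordering and dependency settings that decide when the unit starts at boot.
    systemctl show -p After -p Wants -p Requires "$unit"
}
```

Running "collect_unit_diagnostics slurmctld.service" as root right after a
boot where the service failed would print the three reports in sequence.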

I've tried to look into this, but I can't find much on this problem, either
for Slurm or for systemd services in general.

Any ideas?

Thanks,

~Avery Grieve
They/Them/Theirs please!
University of Michigan