[slurm-users] Slurm on Debian Stretch
martijn.kruiten at surfsara.nl
Thu Mar 5 11:03:38 UTC 2020
We are using Slurm on Debian Stretch at SURFsara on our LISA cluster.
We've been using the Debian Slurm package (
https://salsa.debian.org/hpc-team/slurm-wlm) with a couple of patches,
although we're looking into a different option now.
Anyway, the daemons probably won't start because systemd is looking for
the PID files in the wrong locations. Take a look at SlurmctldPidFile
and SlurmdPidFile in slurm.conf and see if they match the systemd
unit files.
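
The PID file comparison could be scripted roughly as follows. This is only a sketch: the paths /etc/slurm-llnl/slurm.conf and /lib/systemd/system are assumptions about where the Debian packages put things, and the helper names are made up for illustration. It compares SlurmctldPidFile and SlurmdPidFile from slurm.conf with the PIDFile= line of the matching unit file, falling back to the defaults from the slurm.conf man page when a key is not set.

```python
from pathlib import Path

# Assumed locations for the Debian Stretch packages; adjust for your site.
SLURM_CONF = Path("/etc/slurm-llnl/slurm.conf")
UNIT_DIR = Path("/lib/systemd/system")

# slurm.conf key -> (unit file name, default PID file when the key is unset)
DAEMONS = {
    "SlurmctldPidFile": ("slurmctld.service", "/var/run/slurmctld.pid"),
    "SlurmdPidFile": ("slurmd.service", "/var/run/slurmd.pid"),
}


def parse_slurm_conf(text):
    """Return Key=Value settings from slurm.conf text, with keys lowercased."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


def unit_pidfile(text):
    """Return the PIDFile= value from systemd unit file text, or None."""
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("PIDFile="):
            return line.split("=", 1)[1].strip()
    return None


def pidfiles_match(conf_text, unit_text, key, default):
    """True if the slurm.conf PID file (or the default) equals PIDFile=."""
    conf_value = parse_slurm_conf(conf_text).get(key.lower(), default)
    return conf_value == unit_pidfile(unit_text)


if __name__ == "__main__":
    conf_text = SLURM_CONF.read_text() if SLURM_CONF.exists() else ""
    for key, (unit_name, default) in DAEMONS.items():
        unit_path = UNIT_DIR / unit_name
        if not unit_path.exists():
            print(f"{unit_name}: not found in {UNIT_DIR}")
            continue
        ok = pidfiles_match(conf_text, unit_path.read_text(), key, default)
        print(f"{key}: {'match' if ok else 'MISMATCH'}")
```

Note that this is a plain string comparison, so /var/run/... and /run/... will be reported as a mismatch even though /var/run is normally a symlink to /run. A real mismatch would explain the start timeout: with a forking service, systemd waits for the PID file named in PIDFile= to appear.
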
I've not seen "scontrol reconfig" killing slurmctld, so I can't help
you there. Did you set the log verbosity to a debug level to see whether
it says anything before being killed?
As for the munge part, I don't remember offhand how we manage that.
On Wed, 2020-03-04 at 08:54 +0100, Steffen Grunewald wrote:
> Good morning,
> is there anyone out there running Slurm on a Debian Stretch system?
> I've been maintaining a HTCondor pool for quite some time, and
> started an attempt to convert some of the compute nodes to form a
> cluster instead.
> I ran into some issues I could only partially resolve so far:
> - magic UID for "slurm" user, but none for "munge" (and since the
> key has to be shared, unique UIDs are essential)
> - daemons don't start (timeout) when using "service ... start",
> with -D and backgrounding doesn't show anything weird
> - "scontrol reconfig" tends to kill the slurmctld
> Upgrading to Buster isn't an option yet, and I doubt the issues would
> vaporize by upgrading.
> Any suggestions?
> - S