[slurm-users] Slurm on Debian Stretch

Martijn Kruiten martijn.kruiten at surfsara.nl
Thu Mar 5 11:03:38 UTC 2020


Hi Steffen,

We are using Slurm on Debian Stretch at SURFsara on our LISA cluster.
We've been using the Debian Slurm packaging
(https://salsa.debian.org/hpc-team/slurm-wlm) with a couple of patches,
although we're now looking into a different option.

Anyway, the daemons probably fail to start because the PID files are
being looked for in the wrong locations. Compare SlurmctldPidFile and
SlurmdPidFile in slurm.conf with the PIDFile= entries in the systemd
service files and see if they match.
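
A minimal sketch of that check in Python, in case it helps. It only
compares the configured paths as text. The helper names (conf_value,
pidfile_mismatch) and the sample paths are made up for illustration;
the real slurm.conf and unit files on your nodes may contain something
else entirely.

```python
import re


def conf_value(text, key):
    """Return the value of the first `key=value` line in text, or None.

    Works for slurm.conf and for systemd unit files alike; anything
    after a '#' on a line is treated as a comment and ignored.
    """
    pattern = re.compile(r"%s\s*=\s*(\S+)" % re.escape(key), re.IGNORECASE)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


def pidfile_mismatch(slurm_conf_text, unit_text, conf_key):
    """Compare a slurm.conf PID file setting with the unit's PIDFile=.

    Returns (path from slurm.conf, path from unit file, differ?).
    """
    conf_pid = conf_value(slurm_conf_text, conf_key)
    unit_pid = conf_value(unit_text, "PIDFile")
    return conf_pid, unit_pid, conf_pid != unit_pid


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    slurm_conf = "SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid\n"
    unit_file = "[Service]\nType=forking\nPIDFile=/var/run/slurmctld.pid\n"
    print(pidfile_mismatch(slurm_conf, unit_file, "SlurmctldPidFile"))
```

To run it against a real node, read the actual slurm.conf and the
slurmctld/slurmd service files into strings and pass them in. As far as
I know, a unit with Type=forking and a PIDFile= entry makes systemd wait
for that file to appear, which would fit the timeout you describe. Also
keep in mind that /var/run is normally a symlink to /run on Debian, so
two paths that differ only in that prefix refer to the same file.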

I've not seen "scontrol reconfig" killing slurmctld, so I can't help
you there. Did you raise the log verbosity to a debug level and check
whether slurmctld logs anything before being killed?

As for the munge part, I don't remember offhand how we manage that.

Regards,
Martijn

On Wed, 2020-03-04 at 08:54 +0100, Steffen Grunewald wrote:
> Good morning,
> 
> is there anyone out there, running Slurm on a Debian Stretch platform?
> I've been maintaining a HTCondor pool for quite some time, and recently
> started an attempt to convert some of the compute nodes to form a Slurm
> cluster instead.
> 
> I ran into some issues I could only partially resolve yet:
> - magic UID for "slurm" user, but none for "munge" (and since the munge
>   key has to be shared, unique UIDs are essential)
> - daemons don't start (timeout) when using "service ... start", running
>   with -D and backgrounding doesn't show anything weird
> - "scontrol reconfig" tends to kill the slurmctld
> 
> Upgrading to Buster isn't an option yet, and I doubt the issues would
> vaporize by upgrading.
> 
> Any suggestions?
> 
> Thanks,
> - S
> 