[slurm-users] Slurm on Debian Stretch
steffen.grunewald at aei.mpg.de
Mon May 11 14:36:26 UTC 2020
I'm sorry that it took me several weeks to get back to this issue
- never fix anything that isn't broken (... too much), and I've
been busy with user accounting all over the place...
On Thu, 2020-03-05 at 12:03:38 +0100, Martijn Kruiten wrote:
> Hi Steffen,
> We are using Slurm on Debian Stretch at SURFsara on our LISA cluster.
> We've been using the Debian Slurm (
> https://salsa.debian.org/hpc-team/slurm-wlm) with a couple of patches,
> although we're looking into a different option now.
I was confused by that URL, but I checked and I'm sure I've used the
right Slurm by installing slurmd, slurmdbd, slurmctld.
> Anyway, the daemons probably won't start because they're looking for
> the PID files in the wrong locations. Take a look at SlurmctldPidFile
> and SlurmdPidFile in slurm.conf and see if they match the systemd
> service files.
It's a bit more complicated - and here's why I'd like to learn more
about your patches.
We're hosting /etc/slurm on an NFS share, and mount it using autofs.
When the machine comes up, the slurm*.service units are only ordered
"After=network.target", but that's too early, as autofs.service
hasn't even started yet.
Since the slurm*.service units depend on /etc/slurm-llnl/slurm.conf
existing (and a symlink whose target isn't available yet won't count -
apparently the file must be readable), the daemons won't come up in
the first place.
Also, I haven't found out who's in charge of creating all those
.../slurm-llnl subdirectories - again, this might be an ordering
problem, which you may have solved already.
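For illustration, the ordering problem described above (the Slurm units starting before autofs is up) could be addressed with a systemd drop-in along the following lines. This is only a sketch under assumptions: the drop-in path and file name are hypothetical, it is shown for slurmd only, and it assumes the automounter runs as autofs.service as mentioned above. It has not been tested against the Debian Stretch packages.

```ini
# Hypothetical drop-in: /etc/systemd/system/slurmd.service.d/autofs.conf
# (a similar file would be needed for slurmctld.service and slurmdbd.service)
[Unit]
# Pull in the automounter and start slurmd only after it is running,
# so that the NFS-hosted /etc/slurm can be mounted on first access.
Wants=autofs.service
After=autofs.service
```

After adding such a file, "systemctl daemon-reload" makes systemd pick it up. Note that After=autofs.service only orders the unit after the automounter has started; it does not by itself guarantee that the share is already mounted, so whether the check for /etc/slurm-llnl/slurm.conf then succeeds would still need to be verified on the node.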
In the meantime I have found that root can create new associations in
the DB with sacctmgr, but slurmctld needs a restart before it
accepts the new users. I suspect that this has to do with SlurmUser,
StorageUser and AccountingStorageUser all being set to "slurm" (this
works on the CentOS-7 HPC cluster next door) - do you have any advice?
Steffen Grunewald, Cluster Administrator
Max Planck Institute for Gravitational Physics (Albert Einstein Institute)
Am Mühlenberg 1 * D-14476 Potsdam-Golm * Germany
Fon: +49-331-567 7274