[slurm-users] How to launch slurm services after installation

Kamil Wilczek kmwil at mimuw.edu.pl
Mon Nov 28 08:49:42 UTC 2022


Hello,

all supported build flags are listed by the "./configure --help"
command. One of them is "--with-systemdsystemunitdir=DIR", which
lets you specify the directory for the systemd service
files of all Slurm daemons. The most important flag is, imho,
"--prefix", which sets the installation directory.

I'll describe my build setup briefly (sorry for the length),
it might be helpful to someone who is just starting -- I remember when
I was trying to set this up for the first time, it was hell ;)

Of course there are multiple approaches and this is only one of
them. Mine is probably not the best designed or the most optimal ;),
but after several years of using Ubuntu's builds I tried this
approach and it works quite well.

If you are building from source, consider using Ansible or any other
automation tool; the whole process becomes much easier and easily
repeatable, I highly recommend this. Otherwise, making all the changes
manually is a world of pain ;) and prone to errors.

Try setting the "--prefix" to "/opt/slurm_version_build_version".
This way you can try to build with different options many times and it
will be easy to test and delete old/bad versions. When you decide that
the outcome is what you want, you can set the "production" prefix to
"/opt/slurm_version". I think it is also the common approach advised in
the official docs:
https://slurm.schedmd.com/quickstart_admin.html#upgrade
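
A minimal sketch of that naming scheme, in case it helps. The version
number, the "build1" tag and the unit directory inside the prefix are
only assumptions for illustration; the two flags are the ones mentioned
above. The script only prints the commands, so it is safe to run
anywhere:

```shell
#!/bin/sh
# Sketch: derive a throwaway, per-build install prefix for Slurm.
# SLURM_VERSION and BUILD_TAG are illustrative values (assumptions).
SLURM_VERSION="22.05.6"
BUILD_TAG="build1"

# Prefix for test builds, e.g. /opt/slurm_22.05.6_build1
build_prefix() {
    printf '/opt/slurm_%s_%s\n' "$1" "$2"
}

# Prefix for the build you decide to keep, e.g. /opt/slurm_22.05.6
production_prefix() {
    printf '/opt/slurm_%s\n' "$1"
}

PREFIX="$(build_prefix "$SLURM_VERSION" "$BUILD_TAG")"

# Only print the commands; run them by hand from the unpacked source tree.
# Keeping the unit files inside the prefix is an assumption, chosen so a
# test build cannot overwrite units that are already installed system-wide.
echo "./configure --prefix=$PREFIX --with-systemdsystemunitdir=$PREFIX/lib/systemd/system"
echo "make && make install"
```

A bad test build can then be dropped by removing its prefix directory,
and the final build repeats the same steps with the prefix from
production_prefix.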

This way you will have separate binaries for each Slurm version
and in case of problems with the new build you can always return to the
previous one by pointing the "/opt/slurm" symlink back at it.
Slurm is designed not to introduce breaking changes across at least one
major version, if I remember correctly, so switching between versions
should work without problems.
I also set separate state and log directories.

# ls -l /opt

...
root          root           18 Nov  3 11:07 slurm -> /opt/slurm_22.05.5
root          root           94 Aug 12 11:31 slurm_22.05.2
root          root           94 Nov  3 11:05 slurm_22.05.5
slurm         slurm          39 Aug 12 11:36 slurm_log_dir
slurm         slurm          24 Aug 12 11:36 slurm_state_dir
...
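
Switching the link could look roughly like the sketch below. The
switch_slurm helper and the OPT_DIR variable are hypothetical names of
mine; by default the script works in a scratch directory with two empty
stand-in builds, so nothing under the real /opt is touched:

```shell
#!/bin/sh
# Sketch: point the "current" symlink at one installed Slurm version.
# Without OPT_DIR set, work in a scratch directory with stand-in builds.
if [ -z "${OPT_DIR:-}" ]; then
    OPT_DIR="$(mktemp -d)"
    mkdir -p "$OPT_DIR/slurm_22.05.2" "$OPT_DIR/slurm_22.05.5"
fi

# switch_slurm VERSION: repoint $OPT_DIR/slurm at $OPT_DIR/slurm_VERSION
switch_slurm() {
    target="$OPT_DIR/slurm_$1"
    if [ ! -d "$target" ]; then
        echo "no such build: $target" >&2
        return 1
    fi
    # -s symbolic link, -f replace an existing link,
    # -n treat the existing link itself as the destination
    ln -sfn "$target" "$OPT_DIR/slurm"
}

switch_slurm 22.05.5    # move to the new build
switch_slurm 22.05.2    # roll back if the new build misbehaves
readlink "$OPT_DIR/slurm"
```

With OPT_DIR=/opt (as root) the same function would act on a layout
like the listing above. I would stop the Slurm daemons before switching
and start them again afterwards.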

All the systemd service files should use "/opt/slurm/..." paths
in this case, and each build should have separate config files.
This is a bit complicated at first and requires solving several
management problems, but after some time I think it allows for easier
upgrades.
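
As a rough illustration, the sketch below writes a minimal slurmctld
unit that only refers to "/opt/slurm/..." paths. It is not the unit
file shipped with Slurm; the directives are my assumptions, including
the config file living under the prefix's etc directory (the default
when no other sysconfdir is configured) and munge being used for
authentication. UNIT_DIR is a hypothetical variable that defaults to a
scratch directory:

```shell
#!/bin/sh
# Sketch: write a minimal slurmctld unit that goes through the /opt/slurm
# symlink. UNIT_DIR is a scratch directory unless set by the caller.
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"
UNIT_FILE="$UNIT_DIR/slurmctld.service"

# Quoted 'EOF' keeps $MAINPID literal so systemd expands it, not the shell.
cat > "$UNIT_FILE" <<'EOF'
[Unit]
Description=Slurm controller daemon
After=network-online.target munge.service
Wants=network-online.target
# Skip the start if the current build has no config file
ConditionPathExists=/opt/slurm/etc/slurm.conf

[Service]
Type=simple
# -D keeps slurmctld in the foreground so systemd can supervise it
ExecStart=/opt/slurm/sbin/slurmctld -D
ExecReload=/bin/kill -HUP $MAINPID
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

# Show every line that refers to the symlinked installation
grep -n '/opt/slurm/' "$UNIT_FILE"
```

After copying such a file into a systemd unit directory (for example
/etc/systemd/system), "systemctl daemon-reload" followed by
"systemctl enable --now slurmctld" would pick it up. Because the unit
never names a versioned directory, it does not need editing when the
symlink moves.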

Kind regards
-- 
Kamil Wilczek [https://keys.openpgp.org/]
[6C4BE20A90A1DBFB3CBE2947A832BF5A491F9F2A]

On 28.11.2022 at 04:58, Brian Andrus wrote:
> Steve,
> 
> 
> I suspect you did not install the packages.
> 
> 
> You need to install slurm-slurmctld to get the slurmctld systemd files:
> 
>     # rpm -qlp slurm-slurmctld-20.11.9-1.el7.x86_64.rpm
>     /run/slurm/slurmctld.pid
>     /usr/lib/systemd/system/slurmctld.service
>     /usr/sbin/slurmctld
>     /usr/share/man/man8/slurmctld.8.gz
> 
> 
> The same goes for slurm-slurmdbd. Both of those are management daemons and 
> should only be running on one system (two if you configure failover).
> 
> Your compute nodes need slurm-slurmd, which will provide the systemd 
> files for slurmd.
> 
> 
> 
> On 11/27/2022 7:34 PM, 刘 博涵 wrote:
>> Hi all,
>>
>> I'm a newcomer to cluster computing and have been trying to set up a 
>> Slurm cluster myself. Right now I'm stuck at starting up Slurm's 
>> systemd services. I checked out the following tutorials:
>>
>>  1. Slurm Workload Manager - Quick Start Administrator Guide
>>     (schedmd.com) <https://slurm.schedmd.com/quickstart_admin.html>
>>  2. https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_installation/
>>  3. https://wiki.bkslab.org/index.php/Slurm_Installation_Guide
>>  4. Slurm installation (southgreenplatform.github.io)
>>     <https://southgreenplatform.github.io/trainings/hpc/slurminstallation/>
>>
>> All of them state that I should run "systemctl enable/start 
>> slurmd/slurmdbd/slurmctld" after installation, however they always 
>> fail because the corresponding systemd config files do not exist, 
>> regardless of whether I installed Slurm from source or from EPEL 
>> repos. All my systems are CentOS 7.9 with the latest updates prior to 
>> Slurm installation, and I was trying to install Slurm 22.05.6 from 
>> source. My question is are the systemd config files actually created 
>> during installation process as the tutorials imply, or do I have to 
>> write them myself? If the latter, then how should I write my slurm 
>> systemd config file (what parameters should I put in etc.), any 
>> templates I can follow?
>>
>> Many thanks,
>>
>> Steve