[slurm-users] How to delay the start of slurmd until Infiniband/OPA network is fully up?

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Tue Oct 31 09:59:56 UTC 2023


Hi Jeffrey,

On 10/30/23 20:15, Jeffrey R. Lang wrote:
> The service is available in RHEL 8 via the EPEL package repository as systemd-networkd, i.e. systemd-networkd.x86_64  253.4-1.el8  epel

Thanks for the info.  We can install the systemd-networkd RPM from the 
EPEL repo as you suggest.

I tried to understand the properties of systemd-networkd before 
implementing it in our compute nodes.  While there are lots of networkd 
man-pages, it's harder to find an overview of the actual properties of 
networkd.  This is what I found:

* Networkd is a service included in recent versions of Systemd.  It seems 
to be an alternative to NetworkManager.

* Red Hat has stated that systemd-networkd is NOT going to be implemented 
in RHEL 8 or 9.

* Comparing systemd-networkd and NetworkManager: 
https://fedoracloud.readthedocs.io/en/latest/networkd.html

* Networkd is described in the Wikipedia article 
https://en.wikipedia.org/wiki/Systemd

While networkd seems to be really nifty, I hesitate to replace 
NetworkManager with networkd on our EL8 and EL9 systems because this 
would be an unsupported and only lightly tested setup, and it may require 
additional work to keep our systems up-to-date in the future.

It seems to me that Max Rutkowski's solution in 
https://github.com/maxlxl/network.target_wait-for-interfaces is less 
intrusive than converting to systemd-networkd.
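
To illustrate the general idea of waiting for interfaces without 
converting to networkd, here is a minimal sketch in Python.  It is not 
taken from Max's repository; the function names, the default timeout and 
the example interface name are all hypothetical.  It assumes that the 
kernel's /sys/class/net/<interface>/operstate reading "up" is a good 
enough signal that the Infiniband/OPA interface is usable, which may not 
hold for every fabric or driver.

```python
import time
from pathlib import Path


def interface_is_up(name, sysfs_root="/sys/class/net"):
    """Return True if the kernel reports operstate 'up' for this interface."""
    try:
        state = (Path(sysfs_root) / name / "operstate").read_text().strip()
    except OSError:
        # Interface does not exist (yet) or its state cannot be read.
        return False
    return state == "up"


def wait_for_interfaces(names, timeout=120.0, interval=1.0,
                        sysfs_root="/sys/class/net"):
    """Poll until every named interface is up; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        pending = [n for n in names if not interface_is_up(n, sysfs_root)]
        if not pending:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def main(argv):
    """Exit status for a wrapper script: 0 if all interfaces came up, else 1."""
    return 0 if wait_for_interfaces(argv) else 1
```

A wrapper script could end with sys.exit(main(sys.argv[1:])) and be run 
with an interface name such as ib0 (an assumed name) from an 
ExecStartPre= line in a drop-in file for slurmd.service, so that slurmd 
only starts after the check succeeds.  I have not tested this on our 
nodes.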

Best regards,
Ole


> -----Original Message-----
> From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Ole Holm Nielsen
> Sent: Monday, October 30, 2023 1:56 PM
> To: slurm-users at lists.schedmd.com
> Subject: Re: [slurm-users] How to delay the start of slurmd until Infiniband/OPA network is fully up?
> 
> Hi Jens,
> 
> Thanks for your feedback:
> 
> On 30-10-2023 15:52, Jens Elkner wrote:
>> Actually there is no need for such a script since
>> /lib/systemd/systemd-networkd-wait-online should be able to handle it.
> 
> It seems that systemd-networkd exists in Fedora FC38 Linux, but not in
> RHEL 8 and clones, AFAICT.


