This ticket with SchedMD implies it's a munged issue:
https://bugs.schedmd.com/show_bug.cgi?id=1293
Is the munge daemon running on all systems? If it is, are all servers running a network time daemon such as chronyd or ntpd, and is the time in sync on all hosts?
Thanks Mick,
munge is seemingly running on all systems (systemctl status munge). I do get a warning about the munge file changing on disk, but I'm pretty sure that's from warewulf sync'ing files every minute. A sha256sum on the munge.key file on the compute nodes and host node says they're the same, so I think I can put that aside.
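A matching checksum only shows that the key files are identical. A fuller check is to encode a credential on one host and decode it on another, which exercises the key, the daemon, and the clocks together. The sketch below assumes passwordless ssh from the management node to the compute node; the function names are hypothetical, while `munge -n` and `unmunge` are the standard MUNGE command-line tools.

```shell
#!/bin/sh
# Hypothetical helper names; munge and unmunge are the standard MUNGE CLI tools.

# Encode a credential with no payload and decode it on the same host.
munge_local_check() {
    munge -n | unmunge
}

# Encode a credential locally and decode it on a remote host over ssh.
# $1 = remote hostname (assumes passwordless ssh to that host).
munge_remote_check() {
    munge -n | ssh "$1" unmunge
}

# Example, run by hand from the management node (not executed here):
#   munge_remote_check sonic01
```

If the remote decode fails, the status printed by `unmunge` should help distinguish a key mismatch from a clock problem.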
The management node runs chrony and the compute nodes sync to the management node.
[root@kirby uber]# chronyc tracking
Reference ID    : 4A06A849 (t2.time.gq1.yahoo.com)
Stratum         : 3
Ref time (UTC)  : Mon Jan 08 22:26:44 2024
System time     : 0.000032525 seconds slow of NTP time
Last offset     : -0.000021390 seconds
RMS offset      : 0.000055729 seconds
Frequency       : 38.797 ppm slow
Residual freq   : +0.001 ppm
Skew            : 0.018 ppm
Root delay      : 0.033342984 seconds
Root dispersion : 0.000524800 seconds
Update interval : 256.8 seconds
Leap status     : Normal
vs
[root@sonic01 ~]# chronyc tracking
Reference ID    : C0A80102 (warewulf)
Stratum         : 4
Ref time (UTC)  : Mon Jan 08 22:31:02 2024
System time     : 0.000000120 seconds slow of NTP time
Last offset     : -0.000000092 seconds
RMS offset      : 0.000014737 seconds
Frequency       : 47.495 ppm slow
Residual freq   : +0.000 ppm
Skew            : 0.066 ppm
Root delay      : 0.033458963 seconds
Root dispersion : 0.000283949 seconds
Update interval : 64.2 seconds
Leap status     : Normal
So, the compute node is talking to the host and the host is talking to generic NTP sources. "date" shows the same time on the compute nodes
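To repeat the comparison above across many nodes without reading each report by eye, the "System time" line of `chronyc tracking` can be extracted and compared against a tolerance. This is a minimal sketch: the function names and the 1-second tolerance in the example are assumptions for illustration, not values taken from Slurm or MUNGE.

```shell
#!/bin/sh
# Hypothetical helpers for checking the clock offset reported by chronyc.

# Read `chronyc tracking` text on stdin and print the "System time"
# offset in seconds (the first value after the colon on that line).
system_time_offset() {
    awk '/^System time/ {
        for (i = 1; i <= NF; i++)
            if ($i == ":") { print $(i + 1); exit }
    }'
}

# Succeed (exit 0) if offset $1 is no larger than tolerance $2, both in seconds.
offset_within() {
    awk -v o="$1" -v t="$2" 'BEGIN { exit !(o + 0 <= t + 0) }'
}

# Example, run by hand on each node (not executed here):
#   off=$(chronyc tracking | system_time_offset)
#   offset_within "$off" 1 && echo "clock OK: ${off}s" || echo "clock off: ${off}s"
```

Both hosts above report offsets in the microsecond range, so they would pass any reasonable tolerance.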