[slurm-users] Munge thinks clocks aren't synced

Gard Nelson Gard.Nelson at Immunitybio.com
Tue Oct 27 18:08:06 UTC 2020


Hi everyone,

I’m adding a new node to an existing cluster. After installing Slurm and its prerequisites, I synced the clocks with ntpd. Running ‘ntpq -p’ reports 0.0 for delay, offset and jitter (the Slurm head node is also the NTP server), and ‘date’ gives identical times on the head and compute nodes. However, when I start slurmd, I get a munge error about the clocks being out of sync. From the slurmctld log:

[2020-10-27T11:02:06.511] node NEW_NODE returned to service
[2020-10-27T11:02:07.265] error: Munge decode failed: Rewound credential
[2020-10-27T11:02:07.265] ENCODED: Tue Oct 27 11:09:45 2020
[2020-10-27T11:02:07.265] DECODED: Tue Oct 27 11:02:07 2020
[2020-10-27T11:02:07.265] error: Check for out of sync clocks
[2020-10-27T11:02:07.265] error: slurm_unpack_received_msg: MESSAGE_NODE_REGISTRATION_STATUS has authentication error: Rewound credential
[2020-10-27T11:02:07.265] error: slurm_unpack_received_msg: Protocol authentication error
[2020-10-27T11:02:07.275] error: slurm_receive_msg [HEAD_NODE_IP:PORT]: Unspecified error

I restarted ntp, munge and the slurm daemons on both nodes before this last error was generated. Any idea what’s going on here?
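
To put a number on the gap, the ENCODED and DECODED timestamps in the log above can be subtracted directly. The following is only a minimal Python sketch: the helper name credential_skew_seconds is made up for illustration, and it assumes the timestamp layout shown in the log and an English/C locale for the day and month names.

```python
from datetime import datetime

# Assumed layout of the ENCODED/DECODED lines in the slurmctld log,
# e.g. "Tue Oct 27 11:09:45 2020" (day and month names in English).
LOG_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def credential_skew_seconds(encoded: str, decoded: str) -> float:
    """Return encode time minus decode time, in seconds.

    A positive value means the credential carries an encode time that is
    later than the time at which it was decoded.
    """
    encoded_at = datetime.strptime(encoded, LOG_TIME_FORMAT)
    decoded_at = datetime.strptime(decoded, LOG_TIME_FORMAT)
    return (encoded_at - decoded_at).total_seconds()


# Timestamps copied from the log excerpt above.
skew = credential_skew_seconds(
    "Tue Oct 27 11:09:45 2020",
    "Tue Oct 27 11:02:07 2020",
)
print(f"encode time is {skew:.0f} s later than decode time")
```

For the values in the log this comes to 458 seconds (7 minutes 38 seconds), with the encode time being the later of the two.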

Thanks,
Gard