[slurm-users] slurm/munge problem: invalid credentials
Olaf Gellert
gellert at dkrz.de
Tue Dec 15 16:48:57 UTC 2020
Hi all,
We are setting up a new test cluster to try out some features for our
next HPC system. On one of the compute nodes we get these messages
in the log:
[2020-12-15T10:00:21.753] error: Munge decode failed: Invalid credential
[2020-12-15T10:00:21.753] auth/munge: _print_cred: ENCODED: Thu Jan 01 01:00:00 1970
[2020-12-15T10:00:21.753] auth/munge: _print_cred: DECODED: Thu Jan 01 01:00:00 1970
[2020-12-15T10:00:21.753] error: slurm_receive_msg_and_forward: g_slurm_auth_verify: REQUEST_NODE_REGISTRATION_STATUS has authentication error: Invalid authentication credential
[2020-12-15T10:00:21.753] error: slurm_receive_msg_and_forward: Protocol authentication error
[2020-12-15T10:00:21.763] error: service_connection: slurm_receive_msg: Protocol authentication error
I checked munge authentication in the usual way:
- clocks on the nodes are synchronised
- munge uses the same UID/GID on both sides
- "munge -c0 -z0 -n | unmunge" works on the compute nodes and on the slurmctld node
- ssh slurmcontrolnode "munge -c0 -z0 -n" | unmunge works on a compute node
- ssh computenode "munge -c0 -z0 -n" | unmunge works on the slurmctld node
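The checks in the list above can be wrapped in a small script so they are easy to repeat on every node. This is only a sketch: the function names and the peer argument are made up for illustration, passwordless ssh to the peer is assumed, and the munge options are the ones already used above.

```shell
#!/bin/sh
# Sketch only: function names and the peer argument are illustrative.

# Encode a credential locally and decode it locally.
check_local() {
    munge -c0 -z0 -n | unmunge >/dev/null 2>&1
}

# Encode a credential on the peer (via ssh) and decode it locally.
check_from_peer() {
    ssh "$1" munge -c0 -z0 -n | unmunge >/dev/null 2>&1
}

# Run both checks against one peer; returns non-zero if any check fails.
run_checks() {
    peer=$1
    rc=0
    if check_local; then echo "local: ok"; else echo "local: FAILED"; rc=1; fi
    if check_from_peer "$peer"; then echo "from $peer: ok"; else echo "from $peer: FAILED"; rc=1; fi
    return "$rc"
}
```

Calling "run_checks slurmcontrolnode" on a compute node and "run_checks computenode" on the slurmctld node would cover all three checks from both sides.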
So munge seems to work as far as I can tell. What else does Slurm
do with munge? Are hostnames part of the authentication?
Should I be worried about the time "Thu Jan 01 01:00:00 1970"
(in the logs above)?
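On the timestamp: "Thu Jan 01 01:00:00 1970" is what Unix time 0 looks like when printed in a UTC+1 zone such as CET. The ENCODED and DECODED fields are therefore probably unset values from a credential that was never decoded, rather than evidence of a clock problem. That reading is an assumption, not something the log states. A quick way to see the correspondence, assuming GNU date (the function name is made up for illustration):

```shell
#!/bin/sh
# Render Unix time 0 in a fixed UTC+1 zone (POSIX TZ string "CET-1").
# Assumes GNU date, which accepts "-d @SECONDS".
epoch_zero_cet() {
    LC_ALL=C TZ=CET-1 date -d @0 '+%a %b %d %H:%M:%S %Y'
}
```

Running "epoch_zero_cet" should reproduce the string from the log lines.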
All machines run CentOS 8, Slurm is a self-built 20.11.0, and
munge comes from the CentOS 8 RPMs:
munge-0.5.13-1.el8.x86_64
munge-libs-0.5.13-1.el8.x86_64
Cheers, Olaf
--
Dipl. Inform. Olaf Gellert email gellert at dkrz.de
Deutsches Klimarechenzentrum GmbH phone +49 (0)40 460094 214
Bundesstrasse 45a fax +49 (0)40 460094 270
D-20146 Hamburg, Germany www http://www.dkrz.de
Registered office: Hamburg
Managing Director: Prof. Dr. Thomas Ludwig
Court of registration: Amtsgericht Hamburg, HRB 39784