[slurm-users] slurm/munge problem: invalid credentials
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Wed Dec 16 14:29:28 UTC 2020
Hi Olaf,
Since you are testing Slurm, perhaps my Slurm Wiki page may be of interest
to you:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation
There is a discussion about the setup of Munge.
Best regards,
Ole
On 12/15/20 5:48 PM, Olaf Gellert wrote:
> Hi all,
>
> we are setting up a new test cluster to test some features for our
> next HPC system. On one of the compute nodes we get these messages
> in the log:
>
> [2020-12-15T10:00:21.753] error: Munge decode failed: Invalid credential
> [2020-12-15T10:00:21.753] auth/munge: _print_cred: ENCODED: Thu Jan 01
> 01:00:00 1970
> [2020-12-15T10:00:21.753] auth/munge: _print_cred: DECODED: Thu Jan 01
> 01:00:00 1970
> [2020-12-15T10:00:21.753] error: slurm_receive_msg_and_forward:
> g_slurm_auth_verify: REQUEST_NODE_REGISTRATION_STATUS has authentication
> error: Invalid authentication credential
> [2020-12-15T10:00:21.753] error: slurm_receive_msg_and_forward: Protocol
> authentication error
> [2020-12-15T10:00:21.763] error: service_connection: slurm_receive_msg:
> Protocol authentication error
>
> I checked munge authentication in the usual way, so:
> - time between nodes is synchronised
> - munge is using same UID/GID on both sides
> - "munge -c0 -z0 -n | unmunge" works on compute nodes and on slurmctld
> node
> - ssh slurmcontrolnode "munge -c0 -z0 -n" | unmunge on a compute node
> works
> - ssh computenode "munge -c0 -z0 -n" | unmunge on the slurmctld node
> works
>
> So munge seems to work as far as I can tell. What else does
> Slurm do with munge? Are hostnames part of the authentication?
> Do I have to wonder about the time "Thu Jan 01 01:00:00 1970"
> (in the logs above)?
>
> All machines are CentOS8, slurm is self-built 20.11.0,
> munge is from CentOS8 rpm:
>
> munge-0.5.13-1.el8.x86_64
> munge-libs-0.5.13-1.el8.x86_64
>
> Cheers, Olaf
>
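
The manual checks listed in the quoted message (clock sync, matching munge UID/GID, encoding on one node and decoding on the other) could be scripted roughly as follows. This is only a sketch: the function names skew_ok and check_node are made up for illustration, the host names are whatever you pass on the command line, passwordless ssh is assumed, and the 300 second limit is an assumption based on MUNGE's default credential lifetime.

```shell
#!/bin/sh
# Sketch of the cross-node MUNGE checks from the quoted message.
# skew_ok and check_node are hypothetical helper names, not Slurm or MUNGE tools.

# Succeed if two epoch timestamps differ by at most max_skew seconds.
# Default of 300 s is an assumption taken from MUNGE's default TTL.
skew_ok() {
    max_skew=${3:-300}
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    [ "$diff" -le "$max_skew" ]
}

# Run the checks between the local node and one remote node.
check_node() {
    host=$1

    # 1. Clocks roughly in sync.
    local_now=$(date +%s)
    remote_now=$(ssh "$host" date +%s) || return 1
    if ! skew_ok "$local_now" "$remote_now"; then
        echo "$host: clock skew too large" >&2
        return 1
    fi

    # 2. Same UID/GID for the munge user on both sides.
    local_ids="$(id -u munge):$(id -g munge)"
    remote_ids=$(ssh "$host" 'echo "$(id -u munge):$(id -g munge)"') || return 1
    if [ "$local_ids" != "$remote_ids" ]; then
        echo "$host: munge UID/GID mismatch ($local_ids vs $remote_ids)" >&2
        return 1
    fi

    # 3. Encode remotely, decode locally, then the reverse direction.
    if ! ssh "$host" munge -c0 -z0 -n | unmunge >/dev/null; then
        echo "$host: credential from $host not accepted locally" >&2
        return 1
    fi
    if ! munge -c0 -z0 -n | ssh "$host" unmunge >/dev/null; then
        echo "$host: local credential not accepted on $host" >&2
        return 1
    fi

    echo "$host: munge checks passed"
}

# Only contact remote hosts when host names are given as arguments.
if [ "$#" -gt 0 ]; then
    for h in "$@"; do
        check_node "$h"
    done
fi
```

Saved as, say, check-munge.sh, it would be run from the slurmctld node as "sh check-munge.sh node001 node002" (example host names). Passing these checks only shows that the munge and unmunge command-line tools agree between the nodes; it does not by itself prove that slurmd and slurmctld talk to the same munged instance. As for the 1970 timestamps in the log, they are what epoch time 0 looks like in a UTC+1 time zone, which would be consistent with the credential times never having been filled in after the failed decode; that reading is an assumption and is not confirmed in this thread.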