[slurm-users] Munge decode failing on new node

Dean Schulze dean.w.schulze at gmail.com
Thu Apr 23 16:37:32 UTC 2020


I went through the exercise of making the other users' UIDs and GIDs the
same on the slurmctld node as on the slurmd nodes, but that had no effect.
I still have 3 nodes that can talk to slurmctld and one node where slurmd
cannot contact slurmctld.  That node has ssh connectivity to and from the
slurmctld node, but no Slurm communication.
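
For anyone who wants to double-check that the accounts really do match, here
is a minimal sketch (not part of what I actually ran) that compares UID/GID
pairs across nodes.  It assumes you have already saved the output of
`getent passwd` from each node as text; the node names, user names, and
function names below are made up for illustration.

```python
def parse_passwd(text):
    """Map user name -> (uid, gid) from passwd-format text."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 4:
            continue  # skip malformed lines
        entries[fields[0]] = (int(fields[2]), int(fields[3]))
    return entries


def find_mismatches(passwd_by_node, users):
    """Return {user: {node: (uid, gid) or None}} for users whose IDs differ.

    A value of None means the user is missing on that node.
    """
    parsed = {node: parse_passwd(text) for node, text in passwd_by_node.items()}
    mismatches = {}
    for user in users:
        seen = {node: entries.get(user) for node, entries in parsed.items()}
        if len(set(seen.values())) > 1:
            mismatches[user] = seen
    return mismatches


if __name__ == "__main__":
    # Hypothetical sample data standing in for `getent passwd` output.
    sample = {
        "ctld": "slurm:x:64030:64030::/nonexistent:/usr/sbin/nologin\n"
                "dean:x:1000:1000::/home/dean:/bin/bash\n",
        "node4": "slurm:x:64030:64030::/nonexistent:/usr/sbin/nologin\n"
                 "dean:x:1001:1001::/home/dean:/bin/bash\n",
    }
    for user, per_node in find_mismatches(sample, ["slurm", "dean"]).items():
        print(user, per_node)
```

An empty result only says the listed accounts agree; it does not rule out
clock skew or a differing munge key, which would need separate checks.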

It's time to reformat the drive and start over.


On Thu, Apr 23, 2020 at 12:34 AM Gennaro Oliva <oliva.g at na.icar.cnr.it>
wrote:

> Hi Dean,
>
> On Wed, Apr 22, 2020 at 07:28:15PM -0600, dean.w.schulze at gmail.com wrote:
> > Even for users other than slurm and munge?  It seems strange that 3 of
> > 4 worker nodes work with the same UIDs/GIDs as the non-working nodes.
>
> As in:
>
> https://slurm.schedmd.com/quickstart_admin.html
>
> Super Quick Start 1st step:
>
> Make sure the clocks, users and groups (UIDs and GIDs) are synchronized
> across the cluster.
>
> This is true for the slurm user and the regular users running jobs.
>
> The munge user doesn't need to be the same across the whole cluster:
>
> https://bugs.schedmd.com/show_bug.cgi?id=4209
>
> Best regards,
> --
> Gennaro Oliva
>
>