[slurm-users] Cluster not booting after upgrade to debian jessie

Elisabetta Falivene e.falivene at ilabroma.com
Tue Jan 9 08:58:22 MST 2018


> Ciao Elisabetta,
>

Ciao Gennaro! :)


>
> On Tue, Jan 09, 2018 at 01:40:19PM +0100, Elisabetta Falivene wrote:
> > The new kernel was installed during an upgrade from Debian 7 Wheezy to
> > Debian 8 Jessie. The upgrade went ok on the 8 nodes of the cluster, but
> > not on the master. Btw, on the nodes kernel 3.16 is working ok.
>
> You may have some special storage on the front-end that is not
> recognized by the new kernel. I think you'll get better help on a
> Debian-related mailing list like debian-user [1]
>

Good! I'll try that.


>
> > > On 9 January 2018 at 13:16, Elisabetta Falivene
> > > <e.falivene at ilabroma.com> wrote:
> > >> First time after reboot launching sinfo:
> > >>
> > >> *sinfo: error: If munged is up, restart with --num-threads=10*
> > >>
> > >> *sinfo: error: Munge encode failed: Failed to access
> > >> "/var/run/munge/munge.socket2": No such file or directory*
>
> please check your munge installation:
>
>
I did all the checks you suggested and found out that Munge was disabled!
Maybe it was disabled in wheezy and not needed by the old 2.3.4 slurm? Or
could the upgrade have disabled it?
I tried to restart slurm (is /etc/init.d/slurmd restart enough?) after
restarting munge, but it doesn't seem to start correctly. I'll try to
reboot as soon as I have physical access to the machine.
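
For reference, a rough sketch of the checks discussed here. The socket path
is the one from the sinfo error quoted above; the function name
check_munge_socket is only illustrative, and the service names in the
comments (munge, slurmctld, slurmd) are assumptions about the jessie
packages that have not been verified on this cluster.

```shell
#!/bin/sh
# Sketch only: check_munge_socket is a hypothetical helper, and the socket
# path passed to it should be whatever path the sinfo error reports.

# Report whether a munge socket exists at the given path.
# Returns 0 if the path is a socket, 1 otherwise.
check_munge_socket() {
    if [ -S "$1" ]; then
        echo "munge socket present: $1"
    else
        echo "munge socket missing: $1"
        return 1
    fi
}

# Possible follow-up, run by hand as root (service names are assumed):
#   service munge restart
#   munge -n | unmunge          # encode and decode a test credential
#   service slurmctld restart   # on the master
#   service slurmd restart      # on each compute node
#   scontrol ping               # ask whether the controller responds
```

For example, `check_munge_socket /var/run/munge/munge.socket2` reports
whether the socket named in the error exists. If it is missing, munged is
most likely not running, and restarting slurmd alone would probably not be
enough.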

Thank you very much,
Elisabetta