[slurm-users] Cluster not booting after upgrade to debian jessie
oliva.g at na.icar.cnr.it
Tue Jan 9 08:10:31 MST 2018
On Tue, Jan 09, 2018 at 01:40:19PM +0100, Elisabetta Falivene wrote:
> The new kernel was installed during an upgrade from Debian 7 Wheezy to
> Debian 8 Jessie. The upgrade went ok on the 8 nodes of the cluster, but not
> on the master. Btw, on the nodes kernel 3.16 is working ok.
You may have some special storage on the front-end that is not
recognized by the new kernel. I think you'll get better help on a
Debian-related mailing list such as debian-user.
> Stupid question: It's worth trying to make the new kernel work, in your
> opinion? If, in the worst case, I have to keep the 3.2 kernel on the master
> is so bad?
You need 3.16 with Jessie.
> > On 9 January 2018 at 13:16, Elisabetta Falivene <e.falivene at ilabroma.com>
> > wrote:
> >> First time after reboot launching sinfo:
> >> *sinfo: error: If munged is up, restart with --num-threads=10*
> >> *sinfo: error: Munge encode failed: Failed to access
> >> "/var/run/munge/munge.socket.2": No such file or directory*
Please check your munge installation:
Is munge installed?
dpkg -l munge
If not, install it with apt-get install munge.
Is the munge key in place?
ls -la /etc/munge/munge.key
If it is not, create one with create-munge-key and copy the same key to all the nodes.
Is the munge daemon enabled?
systemctl is-enabled munge
If not, enable it with systemctl enable munge (this alone does not start the daemon).
Is the munge daemon started?
systemctl status munge
If not, try to start it with systemctl start munge and check the error messages if it fails.
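
The checks above can also be run in one go. The following is only a rough sketch, not a tested tool: the function names run_check and check_munge are made up for illustration, systemctl is-active stands in for reading the output of systemctl status, and the exit status of dpkg -l is only a coarse signal (a package that was removed but not purged may still be listed). It should be run as root, since /etc/munge is normally not readable by ordinary users.

```sh
#!/bin/sh
# Hypothetical helper: run one command quietly and report ok/FAIL.
# Returns non-zero when the command fails.
run_check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "ok: $label"
    else
        echo "FAIL: $label"
        return 1
    fi
}

# Hypothetical wrapper: the four checks from the list above, in the same order.
check_munge() {
    run_check "munge package installed" dpkg -l munge
    run_check "munge key present" test -f /etc/munge/munge.key
    run_check "munged enabled at boot" systemctl is-enabled munge
    run_check "munged running" systemctl is-active munge
}
```

After saving this to a file and sourcing it, calling check_munge prints one line per check. The first FAIL line points to the step in the list above that needs fixing.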