[slurm-users] Extreme long db upgrade 16.05.6 -> 17.11.3

Peter Kjellström cap at nsc.liu.se
Thu Mar 1 03:47:31 MST 2018


On Wed, 28 Feb 2018 06:51:15 +1100
Chris Samuel <chris at csamuel.org> wrote:

> On Wednesday, 28 February 2018 2:13:41 AM AEDT Miguel Gila wrote:
> 
> > Microcode patches were not applied to the physical system, only the
> > kernel was upgraded, so I'm not sure whether the performance hit
> > could come from that or not.  
> 
> Yes it would, it's the kernel changes that cause the impact.  My
> understanding is that the microcode update had features that were
> intended to mitigate that.

Yes and no.

The kernel has page table isolation (ie meltdown protection) regardless
of microcode level.

The microcode was half of the fix for most of spectre (together with
other kernel patches). If the microcode is unavailable these kernel
patches will not be used/activated. Look for ibrs and ibpb
in /sys/kernel/debug/x86.

The latter part is somewhat Red Hat specific.
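
For anyone who wants to check this across many nodes, here is a rough
Python sketch of the lookup described above. The file names
pti_enabled, ibrs_enabled and ibpb_enabled are an assumption based on
the Red Hat style kernels; other distros may expose nothing there, and
debugfs normally needs root and has to be mounted. The function name is
made up for illustration.

```python
from pathlib import Path

# Assumed Red Hat style tunable names; other kernels may not expose them.
TUNABLES = ("pti_enabled", "ibrs_enabled", "ibpb_enabled")


def read_mitigation_tunables(base="/sys/kernel/debug/x86", names=TUNABLES):
    """Return {name: file contents, or None if unreadable} for each tunable."""
    status = {}
    for name in names:
        try:
            status[name] = (Path(base) / name).read_text().strip()
        except OSError:
            # Missing file, unmounted debugfs, or insufficient permissions.
            status[name] = None
    return status


if __name__ == "__main__":
    for name, value in read_mitigation_tunables().items():
        print(f"{name}: {value if value is not None else 'not available'}")
```

A value of None only means the file could not be read; it does not by
itself tell you whether the mitigation is off or simply not exposed by
that kernel.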

In our tests (before the microcode was reverted and ibrs/ibpb disabled)
this caused more performance impact than the page table isolation
(YMMV).

> Also note Intel later withdrew the microcode update due to
> instability on earlier CPUs (Linux distros reverted their firmware
> updates at that time):
> 
> https://newsroom.intel.com/news/root-cause-of-reboot-issue-identified-updated-guidance-for-customers-and-partners/
> 
> and it appears the most recent update is intended to be pushed out
> via firmware updates rather than a microcode file loaded from the OS.

Or manual download by the admin -> microcode_ctl load in Linux.

/Peter


