Date:	Mon, 7 Dec 2015 23:34:27 +0100
From:	Borislav Petkov <bp@...en8.de>
To:	"Luck, Tony" <tony.luck@...el.com>
Cc:	"Raj, Ashok" <ashok.raj@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>
Subject: Re: [Patch V2] x86, mce: Ensure offline CPU's don't participate in
 mce rendezvous process.

On Mon, Dec 07, 2015 at 10:07:59PM +0000, Luck, Tony wrote:
> > And that is incorrect too, because the MCE (at least the one I'm
> > injecting) gets broadcasted to the CPUs on the *node* and not to the
> > whole system.
> 
> Which system?  What kind of machine check?  On Intel we expect machine checks
> to be broadcast to all logical cpus on all nodes (unless local machine check is enabled,
> in which case SRAR style machine checks go only to the logical cpu that hit the error).
> 
> The code is written to that expectation ... and we don't report things as well if
> something else happens (like too many or too few cpus showing up).

Box logs below.

The BIOS is doing a funny core enumeration:

node #0, CPUs 0-7
node #1, CPUs 8-15
node #2, CPUs 16-23
node #3, CPUs 24-31

and then starts from node 0 again:

 .... node  #0, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
 .... node  #1, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
 .... node  #2, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
 .... node  #3, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
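
To make the numbering concrete, here is a minimal sketch of the CPU-to-node mapping on this particular box. It assumes exactly the layout shown above (4 nodes, 8 CPUs per node per pass, two passes); the names `node_of` and `cpus_on_node` are made up for illustration and are not kernel interfaces.

```python
# Sketch of the enumeration shown above: CPUs 0-31 walk nodes 0..3 in
# groups of 8, then CPUs 32-63 walk nodes 0..3 again.
NODES = 4                # nodes on this box (from the boot log)
CPUS_PER_PASS = 8        # CPUs listed per node in each pass
TOTAL_CPUS = 2 * NODES * CPUS_PER_PASS


def node_of(cpu: int) -> int:
    """Return the node a logical CPU number lands on, for this layout only."""
    if not 0 <= cpu < TOTAL_CPUS:
        raise ValueError(f"CPU {cpu} is outside 0..{TOTAL_CPUS - 1}")
    return (cpu % (NODES * CPUS_PER_PASS)) // CPUS_PER_PASS


def cpus_on_node(node: int) -> list[int]:
    """Return all logical CPUs that belong to the given node."""
    return [cpu for cpu in range(TOTAL_CPUS) if node_of(cpu) == node]


if __name__ == "__main__":
    print(cpus_on_node(0))
```

With this mapping, both 5 and 34 come out as node 0 CPUs.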

So I went and offlined cores 5 and 34, both of which are on node 0.
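
The mail doesn't show how the cores were offlined. One common way is the sysfs CPU hotplug file `/sys/devices/system/cpu/cpuN/online`; the sketch below assumes that interface, and the helper name `set_cpu_online` and its `root` parameter are only illustrative. It needs root on a real system.

```python
from pathlib import Path

# Standard sysfs location of the per-CPU hotplug control files.
SYSFS_CPU_ROOT = "/sys/devices/system/cpu"


def set_cpu_online(cpu: int, online: bool, root: str = SYSFS_CPU_ROOT) -> None:
    """Write 1 (online) or 0 (offline) to cpu<N>/online under `root`.

    `root` is a parameter only so the function can be pointed at a
    scratch directory for testing; it is not part of any kernel API.
    """
    control = Path(root) / f"cpu{cpu}" / "online"
    control.write_text("1" if online else "0")


if __name__ == "__main__":
    # Offline the two node 0 cores mentioned above (requires root).
    for cpu in (5, 34):
        set_cpu_online(cpu, False)
```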

Why node 0? Well, when I inject error type 0x10, which is

0x00000010      Memory Uncorrectable non-fatal

it generates an MCE only on the node 0 cores. For that log see the end
of this mail. The gist of it is that the CPUs on which #MC gets raised
are the cores on node 0, i.e., 0-7 and 32-39.
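
The exact injection commands aren't in this mail. As a rough sketch, assuming the usual APEI EINJ debugfs files (`error_type`, `param1`, `param2`, `error_inject` under `/sys/kernel/debug/apei/einj`), an injection of type 0x10 could look like the following. The helper name, the `einj_dir` parameter and the default page-granular mask are assumptions, and the target address is left to the caller.

```python
from pathlib import Path

# Assumed debugfs location of the EINJ control files.
EINJ_DIR = "/sys/kernel/debug/apei/einj"
MEM_UNCORRECTABLE_NONFATAL = 0x10   # "Memory Uncorrectable non-fatal"


def inject_memory_error(addr: int,
                        mask: int = 0xFFFFFFFFFFFFF000,
                        error_type: int = MEM_UNCORRECTABLE_NONFATAL,
                        einj_dir: str = EINJ_DIR) -> None:
    """Program the EINJ parameters and trigger the injection.

    `addr` is the physical address to hit and `mask` the address mask;
    both are caller-supplied assumptions, not values from the mail.
    """
    base = Path(einj_dir)
    (base / "error_type").write_text(f"{error_type:#x}\n")
    (base / "param1").write_text(f"{addr:#x}\n")
    (base / "param2").write_text(f"{mask:#x}\n")
    # Writing any value here kicks off the actual injection.
    (base / "error_inject").write_text("1\n")
```

This needs root and a firmware that supports EINJ; on a box like the one here it ends in a panic, so only run it on a test machine.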

Cores 5 and 34 are gone, of course.

I mean, even if the #MC gets raised only on that one node, the fix still
works.

$ grep -Ei "hardware.*CPU" /tmp/mce | sed 's/^.*CPU//' | sort -n
 0: Machine Check Exception: 5 Bank 5: be00000000010090
 1: Machine Check Exception: 5 Bank 5: be00000000010090
 2: Machine Check Exception: 5 Bank 5: be00000000010090
 3: Machine Check Exception: 5 Bank 5: be00000000010090
 4: Machine Check Exception: 5 Bank 5: be00000000010090
 6: Machine Check Exception: 5 Bank 5: be00000000010090
 7: Machine Check Exception: 5 Bank 5: be00000000010090
 32: Machine Check Exception: 5 Bank 5: be00000000010090
 33: Machine Check Exception: 5 Bank 5: be00000000010090
 35: Machine Check Exception: 5 Bank 5: be00000000010090
 36: Machine Check Exception: 5 Bank 5: be00000000010090
 37: Machine Check Exception: 5 Bank 5: be00000000010090
 38: Machine Check Exception: 5 Bank 5: be00000000010090
 39: Machine Check Exception: 5 Bank 5: be00000000010090
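
The same check as the grep above can be sketched in Python: collect the CPUs that logged an MCE and compare them with the CPUs expected on node 0. The function names are made up for illustration, and the regex assumes the `mce: [Hardware Error]: CPU N: Machine Check Exception` line format seen in this log.

```python
import re

# Matches only the "mce: [Hardware Error]" lines, not the EDAC repeats.
MCE_RE = re.compile(r"\[Hardware Error\]: CPU (\d+): Machine Check Exception")


def cpus_reporting_mce(log_lines):
    """Return the sorted list of CPU numbers that logged an MCE."""
    cpus = set()
    for line in log_lines:
        match = MCE_RE.search(line)
        if match:
            cpus.add(int(match.group(1)))
    return sorted(cpus)


def missing_cpus(reported, expected):
    """Return the expected CPUs that did not log an MCE."""
    return sorted(set(expected) - set(reported))


if __name__ == "__main__":
    # /tmp/mce is the file name used in the grep command above.
    with open("/tmp/mce") as log:
        reported = cpus_reporting_mce(log)
    node0 = list(range(0, 8)) + list(range(32, 40))
    print("missing:", missing_cpus(reported, node0))
```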




[    0.859060] smpboot: CPU0: Intel(R) Xeon(R) CPU E5-4650 0 @ 2.70GHz (family: 0x6, model: 0x2d, stepping: 0x7
...
[    0.981593] x86: Booting SMP configuration:
[    0.991092] .... node  #0, CPUs:          #1
[    1.013485] microcode: CPU1 microcode updated early to revision 0x710, date = 2013-06-17
[    1.034219]    #2
[    1.049577] microcode: CPU2 microcode updated early to revision 0x710, date = 2013-06-17
[    1.070309]    #3
[    1.085865] microcode: CPU3 microcode updated early to revision 0x710, date = 2013-06-17
[    1.106618]    #4
[    1.121978] microcode: CPU4 microcode updated early to revision 0x710, date = 2013-06-17
[    1.142720]    #5
[    1.158079] microcode: CPU5 microcode updated early to revision 0x710, date = 2013-06-17
[    1.178833]    #6
[    1.194191] microcode: CPU6 microcode updated early to revision 0x710, date = 2013-06-17
[    1.214914]    #7
[    1.230471] microcode: CPU7 microcode updated early to revision 0x710, date = 2013-06-17
[    1.251309] 
[    1.254854] .... node  #1, CPUs:     #8
[    1.275173] microcode: CPU8 microcode updated early to revision 0x710, date = 2013-06-17
[    1.390509]    #9
[    1.406859] microcode: CPU9 microcode updated early to revision 0x710, date = 2013-06-17
[    1.427735]   #10
[    1.444303] microcode: CPU10 microcode updated early to revision 0x710, date = 2013-06-17
[    1.465343]   #11
[    1.481718] microcode: CPU11 microcode updated early to revision 0x710, date = 2013-06-17
[    1.502779]   #12
[    1.519156] microcode: CPU12 microcode updated early to revision 0x710, date = 2013-06-17
[    1.540171]   #13
[    1.556536] microcode: CPU13 microcode updated early to revision 0x710, date = 2013-06-17
[    1.577587]   #14
[    1.594127] microcode: CPU14 microcode updated early to revision 0x710, date = 2013-06-17
[    1.615131]   #15
[    1.631471] microcode: CPU15 microcode updated early to revision 0x710, date = 2013-06-17
[    1.652590] 
[    1.656132] .... node  #2, CPUs:    #16
[    1.676518] microcode: CPU16 microcode updated early to revision 0x710, date = 2013-06-17
[    1.791812]   #17
[    1.808189] microcode: CPU17 microcode updated early to revision 0x710, date = 2013-06-17
[    1.829292]   #18
[    1.845868] microcode: CPU18 microcode updated early to revision 0x710, date = 2013-06-17
[    1.866925]   #19
[    1.883311] microcode: CPU19 microcode updated early to revision 0x710, date = 2013-06-17
[    1.904386]   #20
[    1.920765] microcode: CPU20 microcode updated early to revision 0x710, date = 2013-06-17
[    1.941810]   #21
[    1.958169] microcode: CPU21 microcode updated early to revision 0x710, date = 2013-06-17
[    1.979242]   #22
[    1.995787] microcode: CPU22 microcode updated early to revision 0x710, date = 2013-06-17
[    2.016842]   #23
[    2.033182] microcode: CPU23 microcode updated early to revision 0x710, date = 2013-06-17
[    2.054314] 
[    2.057854] .... node  #3, CPUs:    #24
[    2.078330] microcode: CPU24 microcode updated early to revision 0x710, date = 2013-06-17
[    2.193513]   #25
[    2.209874] microcode: CPU25 microcode updated early to revision 0x710, date = 2013-06-17
[    2.230996]   #26
[    2.247563] microcode: CPU26 microcode updated early to revision 0x710, date = 2013-06-17
[    2.268627]   #27
[    2.284998] microcode: CPU27 microcode updated early to revision 0x710, date = 2013-06-17
[    2.306061]   #28
[    2.322437] microcode: CPU28 microcode updated early to revision 0x710, date = 2013-06-17
[    2.343433]   #29
[    2.359780] microcode: CPU29 microcode updated early to revision 0x710, date = 2013-06-17
[    2.380855]   #30
[    2.397397] microcode: CPU30 microcode updated early to revision 0x710, date = 2013-06-17
[    2.418432]   #31
[    2.434759] microcode: CPU31 microcode updated early to revision 0x710, date = 2013-06-17
[    2.455792] 
[    2.459336] .... node  #0, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
[    2.583817] .... node  #1, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
[    2.710873] .... node  #2, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
[    2.838069] .... node  #3, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
[    2.964288] x86: Booted up 4 nodes, 64 CPUs
[    2.974471] smpboot: Total of 64 processors activated (344907.86 BogoMIPS)


[ 5290.635126] Broke affinity for irq 82
[ 5290.643222] Broke affinity for irq 111
[ 5290.651507] Broke affinity for irq 125
[ 5290.664107] smpboot: CPU 5 is now offline
[ 5298.371336] Broke affinity for irq 31
[ 5298.379528] Broke affinity for irq 82
[ 5298.387627] Broke affinity for irq 103
[ 5298.395908] Broke affinity for irq 110
[ 5298.404187] Broke affinity for irq 111
[ 5298.412450] Broke affinity for irq 112
[ 5298.420733] Broke affinity for irq 118
[ 5298.429017] Broke affinity for irq 124
[ 5298.437295] Broke affinity for irq 125
[ 5298.445584] Broke affinity for irq 127
[ 5298.453880] Broke affinity for irq 137
[ 5298.466543] smpboot: CPU 34 is now offline
[ 5302.187338] EINJ: Error INJection is initialized.
[ 5318.897170] Disabling lock debugging due to kernel taint
[ 5318.910775] mce: [Hardware Error]: CPU 37: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5318.931171] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8135de7f> {intel_idle+0xbf/0x130}
[ 5318.951567] mce: [Hardware Error]: TSC bab9f2d8a4e00 ADDR bb68ec00 MISC 20403ebe86 
[ 5318.969835] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC b microcode 710
[ 5318.990959] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 5319.003825] EDAC sbridge MC0: CPU 37: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5319.023215] EDAC sbridge MC0: TSC bab9f2d8a4e00 
[ 5319.033036] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86 
[ 5319.050338] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC b
[ 5319.069542] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 -  area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
[ 5319.122943] mce: [Hardware Error]: CPU 3: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5319.143355] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8135de7f> {intel_idle+0xbf/0x130}
[ 5319.163846] mce: [Hardware Error]: TSC bab9f2d8a51c1 ADDR bb68ec00 MISC 20403ebe86 
[ 5319.182249] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 6 microcode 710
[ 5319.203539] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 5319.216586] EDAC sbridge MC0: CPU 3: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5319.235994] EDAC sbridge MC0: TSC bab9f2d8a51c1 
[ 5319.245814] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86 
[ 5319.263348] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 6
[ 5319.283041] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 -  area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
[ 5319.337311] mce: [Hardware Error]: CPU 2: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5319.357960] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8159a4d0> {mutex_lock+0x10/0x27}
[ 5319.378519] mce: [Hardware Error]: TSC bab9f2d8a3feb ADDR bb68ec00 MISC 20403ebe86 
[ 5319.397151] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 4 microcode 710
[ 5319.418650] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 5319.431902] EDAC sbridge MC0: CPU 2: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5319.451491] EDAC sbridge MC0: TSC bab9f2d8a3feb 
[ 5319.461311] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86 
[ 5319.479022] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 4
[ 5319.499014] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 -  area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
[ 5319.553209] mce: [Hardware Error]: CPU 6: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5319.574029] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8135de7f> {intel_idle+0xbf/0x130}
[ 5319.594953] mce: [Hardware Error]: TSC bab9f2d8a87ea ADDR bb68ec00 MISC 20403ebe86 
[ 5319.613756] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC c microcode 710
[ 5319.635431] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 5319.648873] EDAC sbridge MC0: CPU 6: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5319.668661] EDAC sbridge MC0: TSC bab9f2d8a87ea 
[ 5319.678483] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86 
[ 5319.696422] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC c
[ 5319.716789] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 -  area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
[ 5319.771531] mce: [Hardware Error]: CPU 38: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5319.792743] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8135de7f> {intel_idle+0xbf/0x130}
[ 5319.813836] mce: [Hardware Error]: TSC bab9f2d8a87ce ADDR bb68ec00 MISC 20403ebe86 
[ 5319.832819] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC d microcode 710
[ 5319.854654] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 5319.868243] EDAC sbridge MC0: CPU 38: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5319.888366] EDAC sbridge MC0: TSC bab9f2d8a87ce 
[ 5319.898186] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86 
[ 5319.916192] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC d
[ 5319.936752] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 -  area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
[ 5319.991752] mce: [Hardware Error]: CPU 35: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5320.013034] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8135de7f> {intel_idle+0xbf/0x130}
[ 5320.034166] mce: [Hardware Error]: TSC bab9f2d8a59dd ADDR bb68ec00 MISC 20403ebe86 
[ 5320.053149] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 7 microcode 710
[ 5320.074972] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 5320.088567] EDAC sbridge MC0: CPU 35: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5320.108688] EDAC sbridge MC0: TSC bab9f2d8a59dd 
[ 5320.118511] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86 
[ 5320.136527] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 7
[ 5320.157079] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 -  area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
[ 5320.212025] mce: [Hardware Error]: CPU 39: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5320.233316] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8135de7f> {intel_idle+0xbf/0x130}
[ 5320.254462] mce: [Hardware Error]: TSC bab9f2d8a4f5c ADDR bb68ec00 MISC 20403ebe86 
[ 5320.273455] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC f microcode 710
[ 5320.295303] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 5320.308905] EDAC sbridge MC0: CPU 39: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5320.329026] EDAC sbridge MC0: TSC bab9f2d8a4f5c 
[ 5320.338847] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86 
[ 5320.356858] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC f
[ 5320.377433] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 -  area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
[ 5320.432474] mce: [Hardware Error]: CPU 7: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5320.453569] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8135de7f> {intel_idle+0xbf/0x130}
[ 5320.474703] mce: [Hardware Error]: TSC bab9f2d8a4d60 ADDR bb68ec00 MISC 20403ebe86 
[ 5320.493689] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC e microcode 710
[ 5320.515532] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 5320.529139] EDAC sbridge MC0: CPU 7: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5320.549050] EDAC sbridge MC0: TSC bab9f2d8a4d60 
[ 5320.558870] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86 
[ 5320.576890] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC e
[ 5320.597478] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 -  area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
[ 5320.652525] mce: [Hardware Error]: CPU 36: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5320.673804] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8135de7f> {intel_idle+0xbf/0x130}
[ 5320.694918] mce: [Hardware Error]: TSC bab9f2d8a5823 ADDR bb68ec00 MISC 20403ebe86 
[ 5320.713916] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 9 microcode 710
[ 5320.735759] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 5320.749347] EDAC sbridge MC0: CPU 36: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5320.769452] EDAC sbridge MC0: TSC bab9f2d8a5823 
[ 5320.779273] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86 
[ 5320.797296] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 9
[ 5320.817877] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 -  area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
[ 5320.872972] mce: [Hardware Error]: CPU 33: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5320.894249] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8135de7f> {intel_idle+0xbf/0x130}
[ 5320.915390] mce: [Hardware Error]: TSC bab9f2d8a5326 ADDR bb68ec00 MISC 20403ebe86 
[ 5320.934374] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 3 microcode 710
[ 5320.956222] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 5320.969807] EDAC sbridge MC0: CPU 33: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5320.989913] EDAC sbridge MC0: TSC bab9f2d8a5326 
[ 5320.999734] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86 
[ 5321.017750] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 3
[ 5321.038284] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 -  area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
[ 5321.093686] mce: [Hardware Error]: CPU 1: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5321.114770] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8135de7f> {intel_idle+0xbf/0x130}
[ 5321.135925] mce: [Hardware Error]: TSC bab9f2d8a5562 ADDR bb68ec00 MISC 20403ebe86 
[ 5321.154918] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 2 microcode 710
[ 5321.176765] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 5321.190369] EDAC sbridge MC0: CPU 1: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5321.210303] EDAC sbridge MC0: TSC bab9f2d8a5562 
[ 5321.220123] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86 
[ 5321.238146] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 2
[ 5321.258723] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 -  area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
[ 5321.303358] mce: [Hardware Error]: CPU 4: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5321.324279] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8135de7f> {intel_idle+0xbf/0x130}
[ 5321.345397] mce: [Hardware Error]: TSC bab9f2d8a572f ADDR bb68ec00 MISC 20403ebe86 
[ 5321.364380] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 8 microcode 710
[ 5321.386184] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 5321.399729] EDAC sbridge MC0: CPU 4: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5321.419624] EDAC sbridge MC0: TSC bab9f2d8a572f 
[ 5321.429445] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86 
[ 5321.447454] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 8
[ 5321.467989] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 -  area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
[ 5321.511475] mce: [Hardware Error]: CPU 32: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5321.532587] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8135de7f> {intel_idle+0xbf/0x130}
[ 5321.553689] mce: [Hardware Error]: TSC bab9f2d8a50f4 ADDR bb68ec00 MISC 20403ebe86 
[ 5321.572681] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 1 microcode 710
[ 5321.594500] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 5321.608057] EDAC sbridge MC0: CPU 32: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5321.628161] EDAC sbridge MC0: TSC bab9f2d8a50f4 
[ 5321.637982] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86 
[ 5321.655998] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 1
[ 5321.676524] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 -  area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
[ 5321.720020] mce: [Hardware Error]: CPU 0: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5321.740939] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8135de7f> {intel_idle+0xbf/0x130}
[ 5321.762058] mce: [Hardware Error]: TSC bab9f2d8a5034 ADDR bb68ec00 MISC 20403ebe86 
[ 5321.781022] mce: [Hardware Error]: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 0 microcode 710
[ 5321.802837] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[ 5321.816395] EDAC sbridge MC0: CPU 0: Machine Check Exception: 5 Bank 5: be00000000010090
[ 5321.836300] EDAC sbridge MC0: TSC bab9f2d8a5034 
[ 5321.846121] EDAC sbridge MC0: ADDR bb68ec00 EDAC sbridge MC0: MISC 20403ebe86 
[ 5321.864127] EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1449517966 SOCKET 0 APIC 0
[ 5321.884647] EDAC MC0: 0 UE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xbb68e offset:0xc00 grain:32 -  area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
[ 5321.928136] mce: [Hardware Error]: Machine check: Processor context corrupt
[ 5321.945589] Kernel panic - not syncing: Fatal machine check
[ 5321.985122] Kernel Offset: disabled
[ 5322.008492] Rebooting in 100 seconds..
[ 5421.226077] ACPI MEMORY or I/O RESET_REG.
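
For reading the log above, here is a small sketch that pulls apart the status value `be00000000010090` and the reported address. The bit positions follow the architectural MCi_STATUS layout as I understand it from the Intel SDM (the S and AR bits only have meaning when the CPU supports software error recovery); the function names are illustrative.

```python
# Architectural flag bits in the upper part of IA32_MCi_STATUS.
STATUS_FLAGS = {
    63: "VAL",    # register contents valid
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting enabled
    59: "MISCV",  # MISC register valid
    58: "ADDRV",  # ADDR register valid
    57: "PCC",    # processor context corrupt
    56: "S",      # signaled via #MC (if recovery is supported)
    55: "AR",     # action required (if recovery is supported)
}


def decode_status(status: int) -> dict:
    """Split an MCi_STATUS value into flags and the two 16-bit codes."""
    flags = [name for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
             if (status >> bit) & 1]
    return {
        "flags": flags,
        "model_code": (status >> 16) & 0xFFFF,  # model-specific error code
        "mca_code": status & 0xFFFF,            # MCA error code
    }


def split_addr(addr: int, page_shift: int = 12) -> tuple[int, int]:
    """Split a physical address into (page, offset), assuming 4K pages."""
    return addr >> page_shift, addr & ((1 << page_shift) - 1)


if __name__ == "__main__":
    print(decode_status(0xBE00000000010090))
    print([hex(part) for part in split_addr(0xBB68EC00)])
```

The PCC flag being set matches the "Processor context corrupt" message right before the panic, and the two 16-bit codes match the `err_code:0001:0090` that EDAC prints.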

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
