Message-ID: <3908561D78D1C84285E8C5FCA982C28F31E21864@ORSMSX106.amr.corp.intel.com>
Date: Tue, 8 Apr 2014 22:34:15 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Jason Baron <jbaron@...mai.com>, Borislav Petkov <bp@...en8.de>
CC: "hpa@...or.com" <hpa@...or.com>,
"mingo@...nel.org" <mingo@...nel.org>,
"dougthompson@...ssion.com" <dougthompson@...ssion.com>,
"m.chehab@...sung.com" <m.chehab@...sung.com>,
"mitake@....info.waseda.ac.jp" <mitake@....info.waseda.ac.jp>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH 3/3] ie31200_edac: Add driver
>> Btw, this driver is polling, AFAICT. Doesn't e3-12xx support the CMCI
>> interrupt which you can feed into this driver directly and thus not need
>> the polling at all?
>
> On the system with the ce and ue events that I'm testing on, I don't see
> 'MCE' nudge above 0, in /proc/interrupts. So I think that implies that
> we are not getting any CMCI there?
CMCI will bump up the "THR" (Threshold) entries in /proc/interrupts.
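A quick way to check this from userspace is to pull the per-cpu counts off the "THR" row. The sketch below is only an illustration: the function name thr_counts is made up here, and it assumes the usual /proc/interrupts layout where each row starts with a label followed by one count per cpu and then a description.

```python
def thr_counts(interrupts_text):
    """Return the per-cpu counts from the THR row, or [] if there is no such row.

    Assumes rows look like: ' THR:   3   0   Threshold APIC interrupts'.
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "THR:":
            counts = []
            for field in fields[1:]:
                # The counts stop where the textual description begins.
                if not field.isdigit():
                    break
                counts.append(int(field))
            return counts
    return []


if __name__ == "__main__":
    with open("/proc/interrupts") as f:
        print(thr_counts(f.read()))
```

If any of the counts is non-zero after injecting or hitting a corrected error, CMCI is being delivered on that cpu.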
> So if possible maybe we can confirm with Intel whether we expect an MCE
> for memory errors...
MCG_CAP bit 10 tells you whether a given processor implements CMCI.
If that is set, then MCi_CTL2 bit 30 indicates whether a given bank
supports it. Linux tries to set this bit; if it sticks, then it knows that
CMCI is supported on that bank. Linux also assigns ownership of the bank
to the first cpu to successfully set it, since a bank may be shared by
multiple threads/cores on a package.
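The two bit tests can be sketched from userspace as below. This is a rough illustration, not what the kernel does: the helper names are invented here, read_msr assumes the msr module is loaded and that you are root, and the MSR numbers (IA32_MCG_CAP at 0x179, IA32_MCi_CTL2 at 0x280 + i) are taken from the SDM. Note that this only reads bit 30; it tells you whether the bit is currently set (e.g. because Linux already enabled it), it does not do the write-and-see-if-it-sticks probe.

```python
import os
import struct

MSR_IA32_MCG_CAP = 0x179    # global machine check capabilities
MSR_IA32_MC0_CTL2 = 0x280   # IA32_MCi_CTL2 is at 0x280 + i
MCG_CMCI_P = 1 << 10        # MCG_CAP bit 10: processor implements CMCI
MCI_CTL2_CMCI_EN = 1 << 30  # MCi_CTL2 bit 30: CMCI enabled on this bank


def read_msr(cpu, msr):
    """Read one 64-bit MSR through /dev/cpu/N/msr (needs root and the msr module)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, msr))[0]
    finally:
        os.close(fd)


def bank_count(mcg_cap):
    """MCG_CAP bits 7:0 give the number of machine check banks."""
    return mcg_cap & 0xFF


def cmci_present(mcg_cap):
    return bool(mcg_cap & MCG_CMCI_P)


def bank_cmci_enabled(ctl2):
    return bool(ctl2 & MCI_CTL2_CMCI_EN)


def report(cpu=0):
    """Print the CMCI state of every bank as seen from one cpu."""
    cap = read_msr(cpu, MSR_IA32_MCG_CAP)
    if not cmci_present(cap):
        print("MCG_CAP bit 10 clear: no CMCI on this processor")
        return
    for bank in range(bank_count(cap)):
        ctl2 = read_msr(cpu, MSR_IA32_MC0_CTL2 + bank)
        print(f"bank {bank}: CMCI {'on' if bank_cmci_enabled(ctl2) else 'off'}")
```

Calling report() as root on the test box should show which banks, if any, have CMCI turned on.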
Consumed uncorrectable errors should generate a machine check, which
on the E3-12xx series will be a fatal machine check: MCi_STATUS.PCC=1.
-Tony