Message-ID: <3908561D78D1C84285E8C5FCA982C28F329282FA@ORSMSX114.amr.corp.intel.com>
Date: Mon, 10 Nov 2014 23:32:12 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Borislav Petkov <bp@...en8.de>,
Aravind Gopalakrishnan <aravind.gopalakrishnan@....com>
CC: Chen Yucong <slaoub@...il.com>,
"ak@...ux.intel.com" <ak@...ux.intel.com>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v3 1/2] x86, mce, severity: extend the the mce_severity
mechanism to handle UCNA/DEFERRED error
But then I tested it ...
I injected a UC error into memory, then did a simple byte write to the target line.
This resulted in two banks logging errors:
[ 124.638045] poll: CPU54 saw ec00000000010092 in bank 7
[ 124.639006] poll: severity = 0
[ 124.647333] poll: CPU54 saw b800000000200179 in bank 3
[ 124.648322] poll: severity = 1
The bank 7 error was reported as severity 0 because EN=0, so we took no action
for it. The bank 3 error got past that hurdle, then got through the next check
because BIT(8) is set, which indicates a cache error. It fell at the last check
because ADDRV=0.
I think the severity table entry for the "EN" check should have been skipped
when calling from the CMCI handler. Then we would have seen severity=1
from the bank 7 error. It would have passed the other tests too (BIT(7) and
ADDRV).
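
For anyone who wants to check the two status values by hand, here is a small
standalone C sketch (user-space, not kernel code) that extracts only the bits
discussed above. The MCI_STATUS_* names follow the kernel's mce.h; the
mcacod_bit() helper, struct mci_summary and summarize_mci_status() are
hypothetical names made up for this illustration, and it does not model the
severity table itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bits in the upper part of MCi_STATUS (names follow the kernel's mce.h). */
#define MCI_STATUS_UC    (1ULL << 61)  /* uncorrected error */
#define MCI_STATUS_EN    (1ULL << 60)  /* error reporting enabled */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* MCi_ADDR holds a valid address */

/* Hypothetical helper: test bit n of the 16-bit MCA error code (bits 15:0). */
static bool mcacod_bit(uint64_t status, unsigned int n)
{
    assert(n < 16);
    return ((status >> n) & 1ULL) != 0;
}

/* Hypothetical summary holding only the fields discussed in this mail. */
struct mci_summary {
    bool uc;
    bool en;
    bool addrv;
    bool mem_err;    /* MCACOD bit 7: memory controller error */
    bool cache_err;  /* MCACOD bit 8: cache hierarchy error */
};

static struct mci_summary summarize_mci_status(uint64_t status)
{
    struct mci_summary s = {
        .uc        = (status & MCI_STATUS_UC) != 0,
        .en        = (status & MCI_STATUS_EN) != 0,
        .addrv     = (status & MCI_STATUS_ADDRV) != 0,
        .mem_err   = mcacod_bit(status, 7),
        .cache_err = mcacod_bit(status, 8),
    };
    return s;
}
```

Feeding it the two logged values, summarize_mci_status(0xec00000000010092ULL)
gives en=0, addrv=1, mem_err=1 for bank 7, and
summarize_mci_status(0xb800000000200179ULL) gives en=1, addrv=0, cache_err=1
for bank 3, which matches the behaviour described above.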
-Tony