Message-ID: <3908561D78D1C84285E8C5FCA982C28F03B1C9@ORSMSX104.amr.corp.intel.com>
Date: Mon, 13 Feb 2012 21:29:22 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Mauro Carvalho Chehab <mchehab@...hat.com>,
Borislav Petkov <bp@...64.org>
CC: Linux Edac Mailing List <linux-edac@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v3 00/31] Hardware Events Report Mecanism (HERM)
> For CE, it is able to detect which DIMM has the problem, but, when lockstep is
> enabled, an Uncorrected Error would point to the two channels at the affected
> branch. Also, when memory mirror is enabled, there are 4 DIMMs associated with the
> same 128-bit memory address. Any one of those DIMMs could be affected by
> the error. Only the Sandy Bridge driver handles memory mirror, and I'll need
> to add some extra logic to the location detection algorithm in order to work
> with it (it is currently on my TODO list).
This looks like a hard problem to solve. In mirror mode, memory writes go to
both sides, while reads come from the master. To determine the location of the
error, you'd need to know which set was the master at the time the error was
detected. Note that this can change (and since we know we just had an error, the
master might have been switched because of that very error). We don't have any
visibility from the OS into what the BIOS is doing with mirrors.
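
As a rough illustration of why the master's identity matters, here is a
minimal toy model in C. Everything in it is hypothetical: the names
(mirror_pair, mirror_write, mirror_read), the single-word "memory", and the
rule that a failed read immediately switches the master are assumptions made
for the sketch, not a description of what any real BIOS or memory controller
does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a mirrored pair; not a real EDAC or BIOS interface. */
enum mirror_side { SIDE_A = 0, SIDE_B = 1 };

struct mirror_pair {
    uint64_t data[2];        /* one word stored on each side */
    bool faulty[2];          /* true if a read from that side hits an error */
    enum mirror_side master; /* side currently serving reads */
};

/* In mirror mode, writes go to both sides. */
static void mirror_write(struct mirror_pair *m, uint64_t value)
{
    m->data[SIDE_A] = value;
    m->data[SIDE_B] = value;
}

/*
 * Reads come from the master. ASSUMPTION: on an error the model fails over
 * to the other side and serves the read from there.
 * Returns the side that saw the error, or -1 if the read was clean.
 */
static int mirror_read(struct mirror_pair *m, uint64_t *out)
{
    enum mirror_side used = m->master;

    if (m->faulty[used]) {
        m->master = (used == SIDE_A) ? SIDE_B : SIDE_A;
        *out = m->data[m->master];
        return (int)used;
    }

    *out = m->data[used];
    return -1;
}
```

In this model, if side A is the master and is faulty, mirror_read() reports
side A as the failing side but leaves side B as the master. Code that only
looks at the current master after the error would therefore blame the wrong
side, which is the ambiguity described above.
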
-Tony