Message-ID: <3908561D78D1C84285E8C5FCA982C28F170F3D5B@ORSMSX104.amr.corp.intel.com>
Date: Wed, 25 Apr 2012 17:55:16 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Borislav Petkov <bp@...64.org>,
Mauro Carvalho Chehab <mchehab@...hat.com>
CC: Linux Edac Mailing List <linux-edac@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Doug Thompson <norsk5@...oo.com>
Subject: RE: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic
layers
> And now the question is, when you get a DRAM ECC, how does the hardware
> point to the DIMM in error, does it give you a (channel, slot) tuple
> or a virtual address which you have to un-interleave? From MCA, you're
> getting a virtual address in MC4_ADDR so how do you compute this one
> back to a DIMM?
Right now we have the EDAC driver doing a reverse translation from the
physical address it finds in MC5_ADDR using the SAD/TAD/... register
information to get to a DIMM address.
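
To give a feel for what "reverse translation" means here, a toy sketch
follows. It is not the driver code and not the real SAD/TAD register
layout: struct toy_rule, struct toy_loc and toy_decode() are all
hypothetical names, and the single-level, power-of-two channel
interleave is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified interleave rule. NOT the real SAD/TAD layout. */
struct toy_rule {
	uint64_t base;        /* first physical address covered */
	uint64_t limit;       /* last physical address covered (inclusive) */
	unsigned ch_ways;     /* number of channels interleaved */
	unsigned gran_shift;  /* log2 of interleave granularity (6 = 64 bytes) */
	uint64_t dimm_size;   /* bytes per DIMM within one channel */
};

struct toy_loc {
	int channel;
	int slot;
	uint64_t dimm_addr;   /* offset inside the DIMM */
};

/* Returns 0 on success, -1 if no rule covers the address. */
int toy_decode(const struct toy_rule *rules, int nrules, uint64_t pa,
	       struct toy_loc *out)
{
	for (int i = 0; i < nrules; i++) {
		const struct toy_rule *r = &rules[i];

		if (pa < r->base || pa > r->limit)
			continue;
		assert(r->ch_ways > 0 && r->dimm_size > 0);

		uint64_t off = pa - r->base;
		uint64_t chunk = off >> r->gran_shift;
		uint64_t within = off & ((1ULL << r->gran_shift) - 1);

		/* Consecutive chunks rotate across the channels. */
		out->channel = (int)(chunk % r->ch_ways);

		/* Squeeze out the channel bits to get a per-channel address. */
		uint64_t ch_addr = ((chunk / r->ch_ways) << r->gran_shift) | within;

		out->slot = (int)(ch_addr / r->dimm_size);
		out->dimm_addr = ch_addr % r->dimm_size;
		return 0;
	}
	return -1;
}
```

With one rule covering 0..4GiB-1, 4-way channel interleave at 64 bytes
and 512MiB DIMMs, address 0x147 lands in chunk 5, so channel 1, slot 0,
offset 0x47. Real hardware adds socket interleave, holes, rank
interleave and so on, which is where the extra register information
comes in.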
Some of the same information is also reported by the BIOS via HEST to
the ghes driver, but Linux currently isn't looking at it. (This was
the code path for getting the physical address on Nehalem/Westmere
generations, where the h/w didn't always provide a valid address.)
See apei_mce_report_mem_error() in mce-apei.c ... the error record
passed in may have a bunch more fields valid which would help in
identifying the DIMM.
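
A rough userspace sketch of what "looking at more fields" could mean:
check the validation bits of the memory error record and pick out
whichever location fields the BIOS marked valid. The struct is trimmed
down and every TOY_* name and toy_describe() is hypothetical; the bit
positions follow my reading of the UEFI memory error section, so treat
them as an assumption and check them against the spec and the kernel
headers before relying on them.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed bit positions, modelled on the UEFI memory error section. */
#define TOY_MEM_VALID_PA      (1u << 1)
#define TOY_MEM_VALID_NODE    (1u << 3)
#define TOY_MEM_VALID_CARD    (1u << 4)
#define TOY_MEM_VALID_MODULE  (1u << 5)
#define TOY_MEM_VALID_BANK    (1u << 6)
#define TOY_MEM_VALID_ROW     (1u << 8)
#define TOY_MEM_VALID_COLUMN  (1u << 9)

/* Trimmed-down, hypothetical stand-in for the error record. */
struct toy_mem_err {
	uint64_t validation_bits;
	uint64_t physical_addr;
	uint16_t node, card, module, bank, row, column;
};

/*
 * Formats the valid location fields into buf as "name=value" pairs.
 * Returns how many location fields were marked valid.
 */
int toy_describe(const struct toy_mem_err *e, char *buf, size_t len)
{
	const struct { uint64_t bit; const char *name; unsigned val; } f[] = {
		{ TOY_MEM_VALID_NODE,   "node",   e->node },
		{ TOY_MEM_VALID_CARD,   "card",   e->card },
		{ TOY_MEM_VALID_MODULE, "module", e->module },
		{ TOY_MEM_VALID_BANK,   "bank",   e->bank },
		{ TOY_MEM_VALID_ROW,    "row",    e->row },
		{ TOY_MEM_VALID_COLUMN, "column", e->column },
	};
	size_t pos = 0;
	int count = 0;

	if (len)
		buf[0] = '\0';

	for (size_t i = 0; i < sizeof(f) / sizeof(f[0]); i++) {
		if (!(e->validation_bits & f[i].bit))
			continue;
		count++;
		if (pos < len) {
			/* Stop appending once the buffer is full. */
			int n = snprintf(buf + pos, len - pos, "%s%s=%u",
					 pos ? " " : "", f[i].name, f[i].val);
			if (n > 0)
				pos += (size_t)n;
		}
	}
	return count;
}
```

A record with only the physical address, node and module bits set would
come out as "node=1 module=3" style text, which is already enough to
narrow things down without any reverse translation.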
-Tony