Message-ID: <3908561D78D1C84285E8C5FCA982C28F3290F9B0@ORSMSX114.amr.corp.intel.com>
Date: Thu, 23 Oct 2014 17:18:29 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Borislav Petkov <bp@...en8.de>, Chen Yucong <slaoub@...il.com>
CC: Andi Kleen <ak@...ux.intel.com>,
Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Aravind Gopalakrishnan <aravind.gopalakrishnan@....com>
Subject: RE: [PATCH] x86, MCE: support memory error recovery for both UCNA
and Deferred error in machine_check_poll

> The general idea of preemptively poisoning pages which contain deferred
> errors is fine though.

Agreed. I used to think that it wasn't likely to be very useful because in many
cases the UCNA errors are just a trail of breadcrumbs set by different units
on the chip as the poison passed through on the way to consumption - where
there would be a fatal (or recoverable) error.

But recently I found that a partial write to a poisoned cache line only sets the
trail of UCNA errors - there is no consumption, so no machine check. So in
this case it would definitely be worthwhile to trigger the same action that we
do for SRAO to unmap the page before someone does do a read.
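
To make the SRAO/UCNA distinction concrete, here is a rough user-space sketch of how a polled bank status might be classified and how the "unmap the page" decision could be extended to UCNA. The bit positions follow the IA32_MCi_STATUS layout in the Intel SDM; the enum, the function names and the simplified severity rules are hypothetical and only for illustration, not the kernel's actual severity table or the code from the patch under discussion.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_MCi_STATUS bits, per the Intel SDM. */
#define MCI_STATUS_VAL   (1ULL << 63)  /* bank holds a valid error */
#define MCI_STATUS_UC    (1ULL << 61)  /* error was not corrected */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* MCi_ADDR holds a valid address */
#define MCI_STATUS_PCC   (1ULL << 57)  /* processor context corrupt */
#define MCI_STATUS_S     (1ULL << 56)  /* signaled via machine check */
#define MCI_STATUS_AR    (1ULL << 55)  /* action required */

/* Hypothetical severity names for this sketch only. */
enum sketch_severity {
	SKETCH_NONE,       /* no valid error logged */
	SKETCH_CORRECTED,  /* corrected error */
	SKETCH_UCNA,       /* uncorrected, no action: UC=1 PCC=0 S=0 AR=0 */
	SKETCH_SRAO,       /* action optional:        UC=1 PCC=0 S=1 AR=0 */
	SKETCH_SRAR,       /* action required:        UC=1 PCC=0 S=1 AR=1 */
	SKETCH_FATAL       /* PCC set, or a combination not handled here */
};

/* Simplified classification; assumes the S/AR bits are meaningful
 * (software error recovery supported) and ignores OVER, EN and MCACOD. */
static enum sketch_severity sketch_classify(uint64_t status)
{
	if (!(status & MCI_STATUS_VAL))
		return SKETCH_NONE;
	if (!(status & MCI_STATUS_UC))
		return SKETCH_CORRECTED;
	if (status & MCI_STATUS_PCC)
		return SKETCH_FATAL;
	if (status & MCI_STATUS_S)
		return (status & MCI_STATUS_AR) ? SKETCH_SRAR : SKETCH_SRAO;
	/* S=0: only AR=0 is treated as UCNA in this sketch. */
	return (status & MCI_STATUS_AR) ? SKETCH_FATAL : SKETCH_UCNA;
}

/* Should the poller try to unmap the page ahead of any read?
 * Treats UCNA the same as SRAO, but only when an address was logged. */
static bool sketch_should_unmap_page(uint64_t status)
{
	enum sketch_severity sev = sketch_classify(status);

	if (!(status & MCI_STATUS_ADDRV))
		return false;
	return sev == SKETCH_SRAO || sev == SKETCH_UCNA;
}
```

With these rules a status of VAL|UC|ADDRV (the UCNA left behind by a partial write to a poisoned line) gets the same answer as VAL|UC|S|ADDRV (SRAO), while SRAR is left to the machine check handler. A real implementation would also have to check that the logged address is a usable physical address before acting on it.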
-Tony