Message-ID: <38de4f009d3248f7bc7c99f29d34ac8a@intel.com>
Date: Thu, 3 Sep 2020 17:09:43 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Borislav Petkov <bp@...en8.de>
CC: Naoya Horiguchi <naoya.horiguchi@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
"Song, Youquan" <youquan.song@...el.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [RFD PATCH] x86/mce: Make sure to send SIGBUS even after losing
the race to poison a page
> Let's see if that logic makes sense: if #MC offlines the page and sends
> SIGBUS but CMCI only offlines the page, isn't it only logical for the
> CMCI to *also* send the SIGBUS too, after having offlined the page?
>
> I.e., both should do the proper and full recovery action. Just sayin...
It made sense, and seemed to explain an issue I was seeing, when I wrote it.
But some stress testing of that patch showed that it introduces some problems
and instability.
Without the patch I can inject 10,000 errors and have every one of them complete
correctly (process gets a SIGBUS with the address of the error). With my patch,
around 0.4% of injections fail to provide the address to the SIGBUS handler. Worse,
the test gets a fatal error every 600-700 injections.
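For readers unfamiliar with what "SIGBUS with the address of the error" means on the
user side, here is a minimal sketch of a handler that records the address delivered
with the signal. It is not the injection test described above. The names
provoke_sigbus(), sigbus_handler() and the temporary file path are made up for
illustration, and the fault is provoked by touching a mapping of an empty file rather
than by a machine check, so si_code is BUS_ADRERR here. For a real memory error
consumed by the process, Linux is expected to report BUS_MCEERR_AR instead, with the
poisoned address in si_addr.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;          /* where to resume after the fault */
static void *volatile reported_addr;      /* si_addr seen by the handler */
static volatile sig_atomic_t reported_code;

/* Hypothetical handler: record what the kernel reported, then jump back out. */
static void sigbus_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    reported_addr = info->si_addr;
    reported_code = info->si_code;
    siglongjmp(recover_point, 1);
}

/*
 * Provoke one SIGBUS and report the address we touched (*expected), the
 * address delivered in siginfo (*got) and the si_code (*code).
 * Returns 0 if a SIGBUS was caught, 1 if none arrived, -1 on setup failure.
 */
int provoke_sigbus(void **expected, void **got, int *code)
{
    struct sigaction sa, old;
    char path[] = "/tmp/sigbus-demo-XXXXXX";  /* illustrative path only */
    long page = sysconf(_SC_PAGESIZE);
    void *map;
    int fd, rc;

    assert(expected && got && code);

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigbus_handler;
    sa.sa_flags = SA_SIGINFO;             /* needed to receive siginfo_t */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return -1;

    /* Map one page of a zero-length file: any access is beyond EOF. */
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (map == MAP_FAILED)
        return -1;

    *expected = map;
    if (sigsetjmp(recover_point, 1) == 0) {
        (void)*(volatile char *)map;      /* this load raises SIGBUS */
        rc = 1;                           /* not reached if the fault fires */
    } else {
        *got = reported_addr;
        *code = (int)reported_code;
        rc = 0;
    }

    munmap(map, (size_t)page);
    sigaction(SIGBUS, &old, NULL);        /* restore the previous handler */
    return rc;
}
```

A caller would compare *got against the address it expected to fault on. The failure
mode reported above corresponds to that comparison not holding, i.e. the handler
running without the address of the error.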
So, I'm abandoning that patch.
-Tony