Message-ID: <3908561D78D1C84285E8C5FCA982C28F7F612DF4@ORSMSX115.amr.corp.intel.com>
Date: Mon, 4 May 2020 20:05:13 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Andy Lutomirski <luto@...capital.net>
CC: Linus Torvalds <torvalds@...ux-foundation.org>,
"Williams, Dan J" <dan.j.williams@...el.com>,
Andy Lutomirski <luto@...nel.org>,
"Thomas Gleixner" <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"Peter Zijlstra" <peterz@...radead.org>,
Borislav Petkov <bp@...en8.de>,
stable <stable@...r.kernel.org>,
the arch/x86 maintainers <x86@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>,
Paul Mackerras <paulus@...ba.org>,
"Benjamin Herrenschmidt" <benh@...nel.crashing.org>,
"Tsaur, Erwin" <erwin.tsaur@...el.com>,
Michael Ellerman <mpe@...erman.id.au>,
"Arnaldo Carvalho de Melo" <acme@...nel.org>,
linux-nvdimm <linux-nvdimm@...ts.01.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v2 0/2] Replace and improve "mcsafe" with copy_safe()
> When a copy function hits a bad page and the page is not yet known to
> be bad, what does it do? (I.e. the page was believed to be fine but
> the copy function gets #MC.) Does it unmap it right away? What does
> it return?
I suspect that we will only ever find a handful of situations worth fixing
where the kernel can recover from memory that has gone bad (it has to be a
code path that touches a meaningful fraction of memory, otherwise we get
code complexity without any meaningful payoff).
I don't think we'd want different actions for the cases of "we just found out
now that this page is bad" and "we got a notification an hour ago that this
page had gone bad". Currently we treat those the same for application
errors ... SIGBUS either way[1].
-Tony
[1] Well, there are options, both global and per-process, to have
the "early" notifications delivered right away.
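For readers who want to see what the footnote looks like from user space, here is a minimal sketch. It assumes the options in question are the `vm.memory_failure_early_kill` sysctl (global) and `prctl(PR_MCE_KILL, ...)` (which, per the man page, sets the policy for the calling thread); the mail itself does not name them. The helper names `request_early_kill` and `classify_sigbus` are made up for illustration.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/prctl.h>

/* Fallbacks in case the libc headers do not expose these si_code values;
 * 4 and 5 are the Linux values. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

/* Hypothetical helper: opt the caller in to "early kill", i.e. ask for
 * SIGBUS as soon as the kernel learns a mapped page is bad, rather than
 * only when the bad data is actually consumed.
 * Returns 0 on success, -1 on error (errno set by prctl). */
int request_early_kill(void)
{
    return prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0);
}

/* Hypothetical helper: interpret siginfo_t.si_code inside a SIGBUS handler.
 * BUS_MCEERR_AR: the error was consumed, the access cannot be retried as is.
 * BUS_MCEERR_AO: the error was found but not yet consumed (early notice). */
const char *classify_sigbus(int si_code)
{
    switch (si_code) {
    case BUS_MCEERR_AR:
        return "action required";
    case BUS_MCEERR_AO:
        return "action optional";
    default:
        return "not a memory error";
    }
}
```

A SIGBUS handler installed with `SA_SIGINFO` would pass `info->si_code` to `classify_sigbus()`; either way the application sees SIGBUS, which matches the "same treatment" point above.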