linux-kernel - Re: [PATCH v2 1/1] arch/fault: don't print logs for pte marker poison errors

Message-ID: <Zk5rsvMs6qVPAw52@x1n>
Date: Wed, 22 May 2024 18:03:30 -0400
From: Peter Xu <peterx@...hat.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Axel Rasmussen <axelrasmussen@...gle.com>,
	Oscar Salvador <osalvador@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Andy Lutomirski <luto@...nel.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@...nel.org>,
	Christophe Leroy <christophe.leroy@...roup.eu>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	David Hildenbrand <david@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, Helge Deller <deller@....de>,
	Ingo Molnar <mingo@...hat.com>,
	"James E.J. Bottomley" <James.Bottomley@...senpartnership.com>,
	John Hubbard <jhubbard@...dia.com>,
	Liu Shixin <liushixin2@...wei.com>,
	"Matthew Wilcox (Oracle)" <willy@...radead.org>,
	Michael Ellerman <mpe@...erman.id.au>,
	Muchun Song <muchun.song@...ux.dev>,
	"Naveen N. Rao" <naveen.n.rao@...ux.ibm.com>,
	Nicholas Piggin <npiggin@...il.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Suren Baghdasaryan <surenb@...gle.com>,
	Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, linux-parisc@...r.kernel.org,
	linuxppc-dev@...ts.ozlabs.org, x86@...nel.org
Subject: Re: [PATCH v2 1/1] arch/fault: don't print logs for pte marker
 poison errors

On Wed, May 15, 2024 at 10:18:31PM +0200, Borislav Petkov wrote:
> So if I were to design this, I'd do it this way:
> 
> 0. guest gets hw poison injected
> 
> 1. it runs memory_failure() and it kills the processes using the page.
> 
> 2. page is marked poisoned on the host so no other guest gets it.
> 
> That's it. No second accesses whatsoever. At least this is how it works
> on baremetal.
> 
> This hw poisoning emulation is just silly and unnecessary.

We (QEMU) haven't consumed this yet, but I think it makes sense to have
such emulation, as it's slightly different from a real hwpoison.

I think the important bit that's missing in this picture is migration: the
VM can migrate from one host to another, carrying that poisoned PFN.

Let's assume we have two hosts: src and dst.  The VM currently runs on the
src host.

Before migration, there is a real PFN that is bad, with an MCE injected.
When it is accessed by either a guest vcpu or the host cpu / hypervisor, the
VM gets killed.  So far this is the same as for any process that has a bad
page.

However, it's possible that the VM gets migrated _before_ that bad PFN is
accessed.  In this case the VM is still legal to run.  The hypervisor will
not migrate the data of that bad PFN, knowing that the data is invalid.
Instead it'll tell dst that "this guest PFN is bad, if the guest accesses
it let's crash it".

What the dst host then needs is a way to describe "this guest PFN is bad".
The easiest way is to describe "this VA of the process is bad"; meanwhile
there'll be no real page backing that VA anyway, and also no real poisoned
pages.  We want to poison a VA only.  That's why an emulation is needed.

Besides that, we want to get exactly what we'd get for a real hwpoison,
e.g. SIGBUS with the address encoded, so that KVM works naturally with it
just like with a real MCE.
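
A minimal sketch of the flow above, written as a toy Python model rather
than QEMU or kernel code.  All names here (ToyHost, GuestBusError, migrate)
are invented for illustration, and the exception only stands in for a
SIGBUS that carries the faulting address.

```python
class GuestBusError(Exception):
    """Hypothetical stand-in for SIGBUS with the bad address encoded."""

    def __init__(self, gpfn):
        super().__init__(f"poisoned guest PFN {gpfn:#x}")
        self.gpfn = gpfn


class ToyHost:
    """Toy host: guest memory is a dict of guest PFN -> page contents."""

    def __init__(self):
        self.pages = {}        # guest PFN -> page data
        self.poisoned = set()  # guest PFNs marked bad (address only)

    def read(self, gpfn):
        # Any access to a poisoned address fails, whether or not a page exists.
        if gpfn in self.poisoned:
            raise GuestBusError(gpfn)
        return self.pages[gpfn]


def migrate(src, dst):
    """Copy good pages; for bad PFNs send only the "this PFN is bad" marker."""
    for gpfn, data in src.pages.items():
        if gpfn in src.poisoned:
            continue  # the data is invalid, so it is never sent
        dst.pages[gpfn] = data
    dst.poisoned |= src.poisoned  # dst poisons the address, not a real page


src = ToyHost()
src.pages = {0x10: b"good", 0x11: b"junk"}
src.poisoned = {0x11}

dst = ToyHost()
migrate(src, dst)
```

After the migration, dst holds no page for guest PFN 0x11, yet reading it
still raises GuestBusError with the address attached, which is the
"poison a VA only" behaviour described above.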

One other thing we could do is inject the poison into the VA together with
the page backing it, but that would turn a PFN on the dst host into a real
bad PFN that the dst OS can no longer use, so it's less optimal.
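
The cost difference between the two options can be sketched with another
toy Python model.  This is only an illustration under assumed names
(ToyDstHost and its methods are hypothetical); it counts how many host
PFNs stay usable on dst in each case.

```python
class ToyDstHost:
    """Toy dst host with a pool of free host PFNs."""

    def __init__(self, host_pfns):
        self.free = set(host_pfns)  # host PFNs the dst OS can still hand out
        self.bad_host_pfns = set()  # host PFNs retired as poisoned
        self.bad_guest_vas = set()  # guest VAs marked bad

    def poison_va_only(self, va):
        # Emulated poison: record the VA, consume no host page.
        self.bad_guest_vas.add(va)

    def poison_va_with_page(self, va):
        # Alternative: back the VA with a real page, then poison that page.
        pfn = self.free.pop()
        self.bad_host_pfns.add(pfn)
        self.bad_guest_vas.add(va)


marker_only = ToyDstHost(range(4))
marker_only.poison_va_only(0x1000)

with_page = ToyDstHost(range(4))
with_page.poison_va_with_page(0x1000)
```

In this model marker_only keeps all four host PFNs free, while with_page
permanently loses one of them even though the underlying memory on dst was
healthy.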

Thanks,

-- 
Peter Xu