Message-ID: <5ef7770cce1e78344a94a6b6c58eca78c616bbb1.camel@intel.com>
Date: Fri, 27 Jun 2025 00:56:07 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "Luck, Tony" <tony.luck@...el.com>, "Hansen, Dave"
	<dave.hansen@...el.com>, "Hunter, Adrian" <adrian.hunter@...el.com>,
	"Annapurve, Vishal" <vannapurve@...gle.com>
CC: "kvm@...r.kernel.org" <kvm@...r.kernel.org>, "Li, Xiaoyao"
	<xiaoyao.li@...el.com>, "Zhao, Yan Y" <yan.y.zhao@...el.com>,
	"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "Chatre,
 Reinette" <reinette.chatre@...el.com>, "seanjc@...gle.com"
	<seanjc@...gle.com>, "tony.lindgren@...ux.intel.com"
	<tony.lindgren@...ux.intel.com>, "tglx@...utronix.de" <tglx@...utronix.de>,
	"Yamahata, Isaku" <isaku.yamahata@...el.com>, "pbonzini@...hat.com"
	<pbonzini@...hat.com>, "binbin.wu@...ux.intel.com"
	<binbin.wu@...ux.intel.com>, "linux-edac@...r.kernel.org"
	<linux-edac@...r.kernel.org>, "hpa@...or.com" <hpa@...or.com>,
	"mingo@...hat.com" <mingo@...hat.com>, "Edgecombe, Rick P"
	<rick.p.edgecombe@...el.com>, "kirill.shutemov@...ux.intel.com"
	<kirill.shutemov@...ux.intel.com>, "bp@...en8.de" <bp@...en8.de>,
	"x86@...nel.org" <x86@...nel.org>, "Gao, Chao" <chao.gao@...el.com>
Subject: Re: [PATCH 2/2] KVM: TDX: Do not clear poisoned pages

On Thu, 2025-06-26 at 15:33 -0700, Hansen, Dave wrote:
> On 6/26/25 15:20, Huang, Kai wrote:
> > But IMHO we should perhaps just have a simple policy that when a page is marked
> > as poisoned, it should never be touched again.  It's only one page anyway
> > (for one TD) so losing that doesn't seem bad to me.  If we want to clear the
> > poisoned page, then perhaps we should mark that page to be not-poisoned
> > again.
> 
> The simplest policy is to do nothing.
> 
> The kernel only has 29 places that check PageHWPoison(). I'd guess that
> roughly half of those are the memory-failure.c infrastructure and
> bare-minimum code to handle poison, like not allowing pages to go back
> into the allocator.
> 
> There are something like 5,000 lines of code in the kernel that deal
> with a literal 'struct page'. 29 checks for ~5,000 sites is pretty
> minuscule. We obviously don't have a policy that every place that uses
> 'struct page' needs to check for poison. We also don't even have a
> policy where writes to or reads from a page check for poison.

It was my understanding that if a page is marked as poisoned, the kernel
should not touch it again.  I thought the kernel was already implemented
that way, e.g. by not allowing poisoned pages to go back to the allocator,
and that the places that use 'struct page' you mentioned would already know
the page is not poisoned.

That said, it was just my guess, so my bad.

> 
> Why is this TDX code so special that PageHWPoison() needs to be checked?
> For instance:
> 
> $ grep -r PageHWPoison arch/x86/
> arch/x86/kernel/cpu/mce/core.c:	SetPageHWPoison(p);
> arch/x86/kernel/cpu/mce/core.c:	SetPageHWPoison(p);
> 
> In other words, this would be the *ONLY* arch/x86 site. Why?

This is one case where I know the kernel could touch a poisoned page.  I
didn't know that writing to memory with a hardware error is fine, so I
thought we should just skip it for safety.

But per Tony it should be fine to write to it, so I am fine with not having
the PageHWPoison() check here.
