Message-ID: <ea0b0b1a842ad1fc209438c776f68ffb4ac17b9f.camel@intel.com>
Date: Thu, 17 Apr 2025 18:21:13 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "tglx@...utronix.de" <tglx@...utronix.de>, "peterz@...radead.org"
	<peterz@...radead.org>, "mingo@...hat.com" <mingo@...hat.com>, "Hansen, Dave"
	<dave.hansen@...el.com>, "Huang, Kai" <kai.huang@...el.com>, "bp@...en8.de"
	<bp@...en8.de>
CC: "ashish.kalra@....com" <ashish.kalra@....com>, "seanjc@...gle.com"
	<seanjc@...gle.com>, "x86@...nel.org" <x86@...nel.org>, "sagis@...gle.com"
	<sagis@...gle.com>, "hpa@...or.com" <hpa@...or.com>, "Chatre, Reinette"
	<reinette.chatre@...el.com>, "kirill.shutemov@...ux.intel.com"
	<kirill.shutemov@...ux.intel.com>, "Williams, Dan J"
	<dan.j.williams@...el.com>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
	"thomas.lendacky@....com" <thomas.lendacky@....com>, "Yamahata, Isaku"
	<isaku.yamahata@...el.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "nik.borisov@...e.com" <nik.borisov@...e.com>
Subject: Re: [PATCH] x86/virt/tdx: Make TDX and kexec mutually exclusive at
 runtime

On Thu, 2025-04-17 at 10:50 -0700, Dave Hansen wrote:
> On 4/16/25 16:02, Kai Huang wrote:
> > Full support for kexec on a TDX host would require complex work.
> > The cache flushing required would need to happen while stopping
> > remote CPUs, which would require changes to a fragile area of the
> > kernel.
> 
> Doesn't kexec already stop remote CPUs? Doesn't this boil down to a
> WBINVD? How is that complex?

When SME added an SME-only WBINVD in stop_this_cpu(), it caused a shutdown hang
on some particular HW. It turned out there was an existing race that was made
worse by the slower operation. There were several attempts to fix it, and
finally tglx patched it up with:

  1f5e7eb7868e ("x86/smp: Make stop_other_cpus() more robust")

But in that patch he said the fix "cannot plug all holes either". So while
looking at doing the WBINVD for TDX kexec, I was advocating for giving this a
harder look before building on top of it. The patches to add TDX kexec support
made the WBINVD happen on all bare metal, not just TDX HW. So whatever races
exist would be exposed to a much wider variety of HW than the SME change was
tested on.

> 
> > It would also require resetting TDX private pages, which is non-
> > trivial since the core kernel does not track them.
> 
> Why? The next kernel will just use KeyID-0 which will blast the old
> pages away with no side effects... right?

I believe this refers to support for working around the #MC erratum. Another
version of kexec TDX support used a KVM callback to have it reset all the TDX
guest memory it knows about.

> 
> > Lastly, it would have to rely on a yet-to-be documented behavior
> > around the TME key (KeyID 0).
> 
> I'll happily wait for the documentation if you insist on it (I don't).

Ok, thanks. This one is probably more of a bonus reason on top of the above.
