Message-ID: <uchg74rtpcpwlkxgqww2n6nh23p4ouaswqc737xy7y6rqzowtb@pbf4whogx2s4>
Date: Mon, 17 Mar 2025 13:03:38 +0200
From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
To: David Woodhouse <dwmw2@...radead.org>
Cc: "Huang, Kai" <kai.huang@...el.com>, Xiaoyao Li <xiaoyao.li@...el.com>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org, "Rafael J. Wysocki" <rafael@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, Adrian Hunter <adrian.hunter@...el.com>,
Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>, Elena Reshetova <elena.reshetova@...el.com>,
Jun Nakajima <jun.nakajima@...el.com>, Rick Edgecombe <rick.p.edgecombe@...el.com>,
Tom Lendacky <thomas.lendacky@....com>, "Kalra, Ashish" <ashish.kalra@....com>,
Sean Christopherson <seanjc@...gle.com>, Baoquan He <bhe@...hat.com>, kexec@...ts.infradead.org,
linux-coco@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCHv9 05/17] x86/kexec: Keep CR4.MCE set during kexec for TDX guest
On Mon, Mar 17, 2025 at 09:27:16AM +0000, David Woodhouse wrote:
> On Thu, 2024-04-04 at 12:32 +0300, Kirill A. Shutemov wrote:
> > On Thu, Apr 04, 2024 at 10:40:34AM +1300, Huang, Kai wrote:
> > >
> > >
> > > On 3/04/2024 4:42 am, Kirill A. Shutemov wrote:
> > > > On Fri, Mar 29, 2024 at 06:48:21PM +0200, Kirill A. Shutemov wrote:
> > > > > On Fri, Mar 29, 2024 at 11:21:32PM +0800, Xiaoyao Li wrote:
> > > > > > On 3/25/2024 6:38 PM, Kirill A. Shutemov wrote:
> > > > > > > TDX guests are not allowed to clear CR4.MCE. Attempt to clear it leads
> > > > > > > to #VE.
> > > > > >
> > > > > > Will we consider making it more safe and compatible for future to guard
> > > > > > against X86_FEATURE_MCE as well?
> > > > > >
> > > > > > If in the future, MCE becomes configurable for TD guest, then CR4.MCE might
> > > > > > not be fixed1.
> > > > >
> > > > > Good point.
> > > > >
> > > > > I guess we can leave it clear if it was clear. This should be easy
> > > > > enough. But we might want to clear even if was set if clearing is allowed.
> > > > >
> > > > > It would require some kind of indication that clearing MCE is fine. We
> > > > > don't have such indication yet. Not sure we can reasonably future-proof
> > > > > the code at this point.
> > > > >
> > > > > But let me think more.
> > > >
> > > > I think I will go with the variant below.
> > > >
> > > > diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
> > > > index 56cab1bb25f5..8e2037d78a1f 100644
> > > > --- a/arch/x86/kernel/relocate_kernel_64.S
> > > > +++ b/arch/x86/kernel/relocate_kernel_64.S
> > > > @@ -5,6 +5,8 @@
> > > > */
> > > > #include <linux/linkage.h>
> > > > +#include <linux/stringify.h>
> > > > +#include <asm/alternative.h>
> > > > #include <asm/page_types.h>
> > > > #include <asm/kexec.h>
> > > > #include <asm/processor-flags.h>
> > > > @@ -145,11 +147,17 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
> > > > * Set cr4 to a known state:
> > > > * - physical address extension enabled
> > > > * - 5-level paging, if it was enabled before
> > > > + * - Machine check exception on TDX guest, if it was enabled before.
> > > > + * Clearing MCE might not allowed in TDX guests, depending on setup.
> > >
> > > Nit: Perhaps we can just call out:
> > >
> > > Clearing MCE is not allowed if it _was_ enabled before.
> > >
> > > Which is always true I suppose.
> >
> > It is true now. Future TDX will allow to clear CR4.MCE and we don't want
> > to flip it back on in this case.
>
> And yet v12 of the patch which became commit de60613173df does
> precisely that.
>
> It uses the original contents of CR4 which are stored in %r13 (instead
> of building a completely new set of bits for CR4 as before). So it
> would never have *cleared* the CR4.MCE bit now anyway... what it does
> is explicitly *set* the bit even if it wasn't set before?
>
> This is what got committed, and I think we can just drop the
> ALTERNATIVE line completely because it's redundant in the case that
> CR4.MCE was already set, and *wrong* in the case that it wasn't already
> set?

But we AND R13 against $(X86_CR4_PAE | X86_CR4_LA57), so we will lose MCE
if we drop the ALTERNATIVE.

And we don't want MCE to be enabled during kexec for !TDX_GUEST:

https://lore.kernel.org/all/1144340e-dd95-ee3b-dabb-579f9a65b3c7@citrix.com/

I think we should patch the AND instruction to include X86_CR4_MCE in the
mask on TDX_GUEST. That keeps MCE only if it was set in the original CR4,
instead of forcing it on:

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index b44d8863e57f..f6c552a39815 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -148,8 +148,8 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	 * Use R13 that contains the original CR4 value, read in relocate_kernel().
 	 * PAE is always set in the original CR4.
 	 */
-	andl $(X86_CR4_PAE | X86_CR4_LA57), %r13d
-	ALTERNATIVE "", __stringify(orl $X86_CR4_MCE, %r13d), X86_FEATURE_TDX_GUEST
+	ALTERNATIVE __stringify(andl $(X86_CR4_PAE | X86_CR4_LA57), %r13d), \
+		    __stringify(andl $(X86_CR4_PAE | X86_CR4_LA57 | X86_CR4_MCE), %r13d), X86_FEATURE_TDX_GUEST
 	movq %r13, %cr4
 
 	/* Flush the TLB (needed?) */
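
To spell out the difference between the three variants discussed in this
thread, here is a rough C model of the CR4 computation. This is not kernel
code: the function names and the tdx_guest flag are made up for
illustration, and the flag only stands in for what ALTERNATIVE patches at
boot. Only the X86_CR4_* bit positions are the architectural ones.

```c
#include <stdint.h>

/* Architectural CR4 bit positions, same values as the kernel's X86_CR4_* masks. */
#define X86_CR4_PAE  (UINT64_C(1) << 5)   /* physical address extension */
#define X86_CR4_MCE  (UINT64_C(1) << 6)   /* machine-check enable */
#define X86_CR4_LA57 (UINT64_C(1) << 12)  /* 5-level paging */

/* Hypothetical model of the committed code: AND with PAE|LA57, then OR in MCE on TDX. */
static uint64_t cr4_committed(uint64_t orig_cr4, int tdx_guest)
{
	uint64_t cr4 = orig_cr4 & (X86_CR4_PAE | X86_CR4_LA57);

	if (tdx_guest)
		cr4 |= X86_CR4_MCE;	/* sets MCE even if it was clear before */
	return cr4;
}

/* Hypothetical model with the ALTERNATIVE line dropped: MCE is always masked out. */
static uint64_t cr4_no_alternative(uint64_t orig_cr4)
{
	return orig_cr4 & (X86_CR4_PAE | X86_CR4_LA57);
}

/* Hypothetical model of the proposal: widen the AND mask on TDX so MCE is preserved. */
static uint64_t cr4_proposed(uint64_t orig_cr4, int tdx_guest)
{
	uint64_t mask = X86_CR4_PAE | X86_CR4_LA57;

	if (tdx_guest)
		mask |= X86_CR4_MCE;	/* keeps MCE only if it was already set */
	return orig_cr4 & mask;
}
```

In this model, a TDX guest whose original CR4 has PAE but not MCE ends up
with PAE | MCE from cr4_committed() and with PAE alone from cr4_proposed().
When MCE was set, both keep it, while cr4_no_alternative() drops it.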
--
Kiryl Shutsemau / Kirill A. Shutemov