Message-ID: <7c5ae62f-c4c7-41d8-af00-7a517e3ed309@intel.com>
Date: Wed, 27 Aug 2025 11:22:07 +0300
From: Adrian Hunter <adrian.hunter@...el.com>
To: Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
Tony Luck <tony.luck@...el.com>
CC: <pbonzini@...hat.com>, <seanjc@...gle.com>, <vannapurve@...gle.com>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
<x86@...nel.org>, H Peter Anvin <hpa@...or.com>,
<linux-edac@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<kvm@...r.kernel.org>, <rick.p.edgecombe@...el.com>, <kai.huang@...el.com>,
<reinette.chatre@...el.com>, <xiaoyao.li@...el.com>,
<tony.lindgren@...ux.intel.com>, <binbin.wu@...ux.intel.com>,
<ira.weiny@...el.com>, <isaku.yamahata@...el.com>, Fan Du <fan.du@...el.com>,
Yazen Ghannam <yazen.ghannam@....com>, <yan.y.zhao@...el.com>,
<chao.gao@...el.com>
Subject: Re: [PATCH RESEND V2 1/2] x86/mce: Fix missing address mask in
recovery for errors in TDX/SEAM non-root mode
On 20/08/2025 00:32, Borislav Petkov wrote:
> On Tue, Aug 19, 2025 at 07:24:34PM +0300, Adrian Hunter wrote:
>> Commit 8a01ec97dc066 ("x86/mce: Mask out non-address bits from machine
>> check bank") introduced a new #define MCI_ADDR_PHYSADDR for the mask of
>> valid physical address bits within the machine check bank address register.
>>
>> This is particularly needed in the case of errors in TDX/SEAM non-root mode
>> because the reported address contains the TDX KeyID. Refer to TDX and
>> TME-MK documentation for more information about KeyIDs.
>>
>> Commit 7911f145de5fe ("x86/mce: Implement recovery for errors in TDX/SEAM
>> non-root mode") uses the address to mark the affected page as poisoned, but
>> omits to use the aforementioned mask.
>>
>> Investigation of user space expectations has concluded it would be more
>> correct for the address to contain only address bits in the first place.
>> Refer to https://lore.kernel.org/r/807ff02d-7af0-419d-8d14-a4d6c5d5420d@intel.com
>>
>> Mask the address when it is read from the machine check bank address
>> register. Do not use MCI_ADDR_PHYSADDR because that will be removed in a
>> later patch.
>
> Why is this patch talking about TDX-something but doing "global" changes to
> mce.addr?
>
> Why don't you simply do a TDX-specific masking out when you're running
> in a TDX env and leave the rest as is?
>
How about this?
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 4da4eab56c81..3963d4cd8fc1 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -699,6 +699,8 @@ static noinstr void mce_read_aux(struct mce_hw_err *err, int i)
 		}
 
 		smca_extract_err_addr(m);
+
+		tdx_extract_err_addr(m);
 	}
 
 	if (mce_flags.smca) {
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index b5ba598e54cb..fcf0b84a7c98 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -298,6 +298,16 @@ static inline bool amd_mce_usable_address(struct mce *m) { return false; }
 static inline void smca_extract_err_addr(struct mce *m) { }
 #endif
 
+#ifdef CONFIG_X86_MCE_INTEL
+static __always_inline void tdx_extract_err_addr(struct mce *m)
+{
+	if (boot_cpu_has(X86_FEATURE_TDX_HOST_PLATFORM))
+		m->addr &= GENMASK_ULL(boot_cpu_data.x86_phys_bits - 1, 0);
+}
+#else
+static inline void tdx_extract_err_addr(struct mce *m) { }
+#endif
+
 #ifdef CONFIG_X86_ANCIENT_MCE
 void intel_p5_mcheck_init(struct cpuinfo_x86 *c);
 void winchip_mcheck_init(struct cpuinfo_x86 *c);
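
To illustrate the arithmetic only, here is a minimal userspace sketch of the
masking that tdx_extract_err_addr() performs. It is not kernel code: the
GENMASK_ULL() definition is a stand-in for the kernel macro, mask_err_addr()
is a hypothetical helper, and the bit layout used in the note below (KeyID
bits sitting directly above the usable physical address bits) is an
assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK_ULL(h, l): bits h..l set. */
#define GENMASK_ULL(h, l) \
	(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

/*
 * Hypothetical helper, for illustration only: keep bits
 * [phys_bits - 1:0] of a reported address and clear everything above,
 * which is where the KeyID is assumed to live.
 */
static uint64_t mask_err_addr(uint64_t addr, unsigned int phys_bits)
{
	/* Guard against an undefined shift in GENMASK_ULL(). */
	assert(phys_bits >= 1 && phys_bits <= 64);

	return addr & GENMASK_ULL(phys_bits - 1, 0);
}
```

With an assumed 46 usable physical address bits, an address reported as
((uint64_t)5 << 46) | 0x12345000 (KeyID 5 in the upper bits) comes back from
mask_err_addr() as 0x12345000, while an address with no upper bits set is
returned unchanged.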