Message-ID: <20250812122859.70911-2-adrian.hunter@intel.com>
Date: Tue, 12 Aug 2025 15:28:58 +0300
From: Adrian Hunter <adrian.hunter@...el.com>
To: Tony Luck <tony.luck@...el.com>,
pbonzini@...hat.com,
seanjc@...gle.com
Cc: vannapurve@...gle.com,
Borislav Petkov <bp@...en8.de>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
x86@...nel.org,
H Peter Anvin <hpa@...or.com>,
linux-edac@...r.kernel.org,
linux-kernel@...r.kernel.org,
kvm@...r.kernel.org,
rick.p.edgecombe@...el.com,
kai.huang@...el.com,
reinette.chatre@...el.com,
xiaoyao.li@...el.com,
tony.lindgren@...ux.intel.com,
binbin.wu@...ux.intel.com,
ira.weiny@...el.com,
isaku.yamahata@...el.com,
Fan Du <fan.du@...el.com>,
Yazen Ghannam <yazen.ghannam@....com>,
yan.y.zhao@...el.com,
chao.gao@...el.com
Subject: [PATCH V2 1/2] x86/mce: Fix missing address mask in recovery for errors in TDX/SEAM non-root mode

Commit 8a01ec97dc066 ("x86/mce: Mask out non-address bits from machine
check bank") introduced a new #define MCI_ADDR_PHYSADDR for the mask of
valid physical address bits within the machine check bank address register.
This is particularly needed in the case of errors in TDX/SEAM non-root mode
because the reported address contains the TDX KeyID. Refer to TDX and
TME-MK documentation for more information about KeyIDs.

Commit 7911f145de5fe ("x86/mce: Implement recovery for errors in TDX/SEAM
non-root mode") uses the address to mark the affected page as poisoned, but
does not apply the aforementioned mask.

Investigation of user space expectations has concluded it would be more
correct for the address to contain only address bits in the first place.
Refer to https://lore.kernel.org/r/807ff02d-7af0-419d-8d14-a4d6c5d5420d@intel.com

Mask the address when it is read from the machine check bank address
register. Do not use MCI_ADDR_PHYSADDR because that will be removed in a
later patch.

It is assumed __log_error() in arch/x86/kernel/cpu/mce/amd.c does not need
similar treatment.

Amend struct mce addr member description slightly to reflect that it is
not, and never has been, an exact copy of the bank's MCi_ADDR MSR.

Fixes: 8a01ec97dc066 ("x86/mce: Mask out non-address bits from machine check bank")
Fixes: 7911f145de5fe ("x86/mce: Implement recovery for errors in TDX/SEAM non-root mode")
Link: https://lore.kernel.org/r/807ff02d-7af0-419d-8d14-a4d6c5d5420d@intel.com
Cc: stable@...r.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@...el.com>
---
Changes in V2:
Mask address when it is read
Amend struct mce addr description
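
For illustration only, the effect of the new mask can be sketched in user
space. This is not kernel code: genmask_ull() is a stand-in for the kernel's
GENMASK_ULL(), and mask_mce_addr(), addr_to_pfn() and DEMO_PAGE_SHIFT are
hypothetical names. The sketch assumes the KeyID sits in the bits directly
above the usable physical address width, and that phys_bits is in the range
1..64.

```c
#include <stdint.h>

/* Assumed 4 KiB pages, for illustration only. */
#define DEMO_PAGE_SHIFT 12

/* User-space stand-in for the kernel's GENMASK_ULL(h, l): bits h..l set. */
static uint64_t genmask_ull(unsigned int h, unsigned int l)
{
	return (~0ULL << l) & (~0ULL >> (63 - h));
}

/*
 * Keep only the low phys_bits bits of a raw MCi_ADDR value, mirroring
 * "m->addr &= GENMASK_ULL(boot_cpu_data.x86_phys_bits - 1, 0)".
 * phys_bits must be in the range 1..64.
 */
static uint64_t mask_mce_addr(uint64_t raw_addr, unsigned int phys_bits)
{
	return raw_addr & genmask_ull(phys_bits - 1, 0);
}

/* Page frame number that a poisoning path would derive from an address. */
static uint64_t addr_to_pfn(uint64_t addr)
{
	return addr >> DEMO_PAGE_SHIFT;
}
```

With a hypothetical phys_bits of 46 and a KeyID of 5 placed at bit 46, the
raw value (5ULL << 46) | 0x12345000 masks down to 0x12345000, whose PFN is
0x12345. Without the mask, the KeyID bits leak into the PFN and the wrong
page would be selected for poisoning.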
arch/x86/include/uapi/asm/mce.h | 2 +-
arch/x86/kernel/cpu/mce/core.c | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index cb6b48a7c22b..abf6ee43f5f8 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -14,7 +14,7 @@
struct mce {
__u64 status; /* Bank's MCi_STATUS MSR */
__u64 misc; /* Bank's MCi_MISC MSR */
- __u64 addr; /* Bank's MCi_ADDR MSR */
+ __u64 addr; /* Address from bank's MCi_ADDR MSR */
__u64 mcgstatus; /* Machine Check Global Status MSR */
__u64 ip; /* Instruction Pointer when the error happened */
__u64 tsc; /* CPU time stamp counter */
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 4da4eab56c81..deb47463a75d 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -699,6 +699,9 @@ static noinstr void mce_read_aux(struct mce_hw_err *err, int i)
}
smca_extract_err_addr(m);
+
+ /* Mask out non-address bits, such as TDX KeyID */
+ m->addr &= GENMASK_ULL(boot_cpu_data.x86_phys_bits - 1, 0);
}
if (mce_flags.smca) {
--
2.48.1