Message-ID: <287caf7fda25b8ac27211d2e50fa1077e0bf0bf6.camel@intel.com>
Date: Wed, 30 Jul 2025 11:57:08 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "Luck, Tony" <tony.luck@...el.com>, "Hunter, Adrian"
<adrian.hunter@...el.com>, "Hansen, Dave" <dave.hansen@...el.com>,
"Annapurve, Vishal" <vannapurve@...gle.com>
CC: "kvm@...r.kernel.org" <kvm@...r.kernel.org>, "Li, Xiaoyao"
<xiaoyao.li@...el.com>, "Zhao, Yan Y" <yan.y.zhao@...el.com>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"tony.lindgren@...ux.intel.com" <tony.lindgren@...ux.intel.com>,
"binbin.wu@...ux.intel.com" <binbin.wu@...ux.intel.com>, "Chatre, Reinette"
<reinette.chatre@...el.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "mingo@...hat.com" <mingo@...hat.com>,
"Yamahata, Isaku" <isaku.yamahata@...el.com>,
"kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
"tglx@...utronix.de" <tglx@...utronix.de>, "linux-edac@...r.kernel.org"
<linux-edac@...r.kernel.org>, "hpa@...or.com" <hpa@...or.com>,
"seanjc@...gle.com" <seanjc@...gle.com>, "pbonzini@...hat.com"
<pbonzini@...hat.com>, "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
"bp@...en8.de" <bp@...en8.de>, "Gao, Chao" <chao.gao@...el.com>,
"x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH 1/2] x86/mce: Fix missing address mask in recovery for
errors in TDX/SEAM non-root mode
On Wed, 2025-07-30 at 13:54 +0300, Adrian Hunter wrote:
> But there are also additional places where it seems like MCI_ADDR_PHYSADDR
> is missing:
>
>	tdx_dump_mce_info()
>		paddr_is_tdx_private()
>			__seamcall_ret(TDH_PHYMEM_PAGE_RDMD, &args)
>				TDH_PHYMEM_PAGE_RDMD expects KeyID bits to be zero
This is only called in the mce_panic() path, which basically means the #MC
is fatal, e.g., it happened in kernel context.

The intention is to catch any #MC caused by a kernel bug (i.e., a software
issue rather than a hardware error) that does a partial write to TDX
private memory which is then read at a later time, and to report more
precise information to the user, pointing out that this could be due to a
"possible kernel bug". See the changelog of 70060463cb2b ("x86/mce:
Differentiate real hardware #MCs from TDX erratum ones").

In other words, in this case the address reported via MCI_ADDR_PHYSADDR
should not contain any KeyID bits, since the kernel always uses KeyID 0 to
read.

I believe the KeyID bits will only be appended to the physical address
reported in MCI_ADDR_PHYSADDR when the #MC was triggered from a TDX guest,
i.e., when the CPU was accessing memory using a TDX KeyID. Such a #MC is
not fatal and won't call into mce_panic().

That being said, for tdx_dump_mce_info(), while explicitly masking out the
KeyID bits in MCI_ADDR_PHYSADDR obviously doesn't hurt (and is arguably
better in some ways), it is not necessary AFAICT.
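
For illustration only, here is a minimal userspace sketch of what "masking
out the KeyID bits" amounts to. It is not kernel code: physaddr_mask(),
strip_keyid() and the phys_bits parameter are hypothetical names made up
for this example, and it assumes the KeyID occupies the physical address
bits at and above phys_bits.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build a mask covering address bits [phys_bits-1:0].
 * Assumption: phys_bits is the number of physical address bits that remain
 * once the KeyID bits have been carved out of the top of the address.
 */
static inline uint64_t physaddr_mask(unsigned int phys_bits)
{
	/* Shifting a 64-bit value by 64 or more is undefined in C. */
	if (phys_bits >= 64)
		return UINT64_MAX;
	return (UINT64_C(1) << phys_bits) - 1;
}

/*
 * Hypothetical helper: drop any KeyID bits from a reported address.
 * An address that was accessed with KeyID 0 is returned unchanged.
 */
static inline uint64_t strip_keyid(uint64_t reported_addr,
				   unsigned int phys_bits)
{
	return reported_addr & physaddr_mask(phys_bits);
}
```

With phys_bits set to 46, for example, an address reported with KeyID 5 in
the bits above bit 45 comes back with only the low 46 bits, while an
address with no KeyID bits set is left as it was. The latter is the
tdx_dump_mce_info() case described above, which is why the extra masking
would be harmless there.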