[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aKUKVdonFGwUZI_k@google.com>
Date: Tue, 19 Aug 2025 16:35:49 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Rick P Edgecombe <rick.p.edgecombe@...el.com>
Cc: Adrian Hunter <adrian.hunter@...el.com>, Chao Gao <chao.gao@...el.com>,
Len Brown <len.brown@...el.com>, Kai Huang <kai.huang@...el.com>,
"binbin.wu@...ux.intel.com" <binbin.wu@...ux.intel.com>, Xiaoyao Li <xiaoyao.li@...el.com>,
Reinette Chatre <reinette.chatre@...el.com>,
"kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
"tony.lindgren@...ux.intel.com" <tony.lindgren@...ux.intel.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"pbonzini@...hat.com" <pbonzini@...hat.com>, Isaku Yamahata <isaku.yamahata@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Yan Y Zhao <yan.y.zhao@...el.com>,
Ira Weiny <ira.weiny@...el.com>
Subject: Re: [PATCH RFC 1/2] KVM: TDX: Disable general support for MWAIT in guest
On Tue, Aug 19, 2025, Rick P Edgecombe wrote:
> On Tue, 2025-08-19 at 10:38 +0300, Adrian Hunter wrote:
> > On 18/08/2025 21:49, Edgecombe, Rick P wrote:
> > > Attn: Binbin, Xiaoyao
> > >
> > > On Mon, 2025-08-18 at 07:05 -0700, Sean Christopherson wrote:
> > > > NAK.
> > > >
> > > > Fix the guest, or wherever else in the pile there are issues. KVM is NOT carrying
> > > > hack-a-fixes to workaround buggy software/firmware. Been there, done that.
> > >
> > > Yes, I would have thought we should have at least had a TDX module change option
> > > for this.
> >
> > That would not help with existing TDX Modules, and would possibly require
> > a guest opt-in, which would not help with existing guests. Hence, to start
> > with disabling the feature first, and look for another solution second.
>
> I think you have the priorities wrong. There are only so many kludges we can ask
> KVM to take. Across all the changes people want for TDX, do you think not having
> to update the TDX module, backport a guest fix or even just adjust qemu args is
> more important the other stuff?
I'm especially sensitive to fudging around _bugs_ in other pieces of the stack.
KVM has been burned badly, multiple times, by hacking around issues elsewhere.
There are inevitably cases where throwing something into KVM is the least awful
choice (usually because it's the only feasible choise), but this ain't one of
those cases.
> TDX support is still very early. We need to think about long term sustainable
> solutions. So a fix that doesn't support existing TDX modules or guests (the
> intel_idle fix is also in this category anyway) should absolutely be on the
> table.
>
> >
> > In the MWAIT case, Sean has rejected supporting MSR_PKG_CST_CONFIG_CONTROL
> > even for VMX, because it is an optional MSR, so altering intel_idle is
> > being proposed.
No, I rejected support MSR_PKG_CST_CONFIG_CONTROL _in KVM_ because I don't see
any reason to shove information into KVM. AFAICT, it's not an "architectural"
MSR, and all of KVM's existing handling of truly uarch/model-specific MSRs is
painful and ugly.
And userspace (e.g. QEMU) could support emulate MSR_PKG_CST_CONFIG_CONTROL (and
any other MSRs of that nature) via MSR filters. I doubt the MSR is accessed
outside of boot paths, so the cost of a userspace exit should be a non-issue.
Of course, QEMU probably can't provide useful/accurate information.
One option if there is a super strong need to do so would be to add a "disable
exits" capability to let the guest read package c-state MSRs, but that has
obvious downsides and would still just be fudging around a flawed driver.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8dbf19aa66ef..c254aa26ff22 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4120,6 +4120,15 @@ void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R);
vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C7_RESIDENCY, MSR_TYPE_R);
}
+ if (kvm_package_cstate_in_guest(vcpu->kvm)) {
+ vmx_disable_intercept_for_msr(vcpu, MSR_PKG_CST_CONFIG_CONTROL, MSR_TYPE_R);
+ vmx_disable_intercept_for_msr(vcpu, MSR_PKG_C2_RESIDENCY, MSR_TYPE_R);
+ vmx_disable_intercept_for_msr(vcpu, MSR_PKG_C3_RESIDENCY, MSR_TYPE_R);
+ vmx_disable_intercept_for_msr(vcpu, MSR_PKG_C6_RESIDENCY, MSR_TYPE_R);
+ vmx_disable_intercept_for_msr(vcpu, MSR_PKG_C8_RESIDENCY, MSR_TYPE_R);
+ vmx_disable_intercept_for_msr(vcpu, MSR_PKG_C9_RESIDENCY, MSR_TYPE_R);
+ vmx_disable_intercept_for_msr(vcpu, MSR_PKG_C10_RESIDENCY, MSR_TYPE_R);
+ }
if (kvm_aperfmperf_in_guest(vcpu->kvm)) {
vmx_disable_intercept_for_msr(vcpu, MSR_IA32_APERF, MSR_TYPE_R);
vmx_disable_intercept_for_msr(vcpu, MSR_IA32_MPERF, MSR_TYPE_R);
Powered by blists - more mailing lists