lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aKUKVdonFGwUZI_k@google.com>
Date: Tue, 19 Aug 2025 16:35:49 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Rick P Edgecombe <rick.p.edgecombe@...el.com>
Cc: Adrian Hunter <adrian.hunter@...el.com>, Chao Gao <chao.gao@...el.com>, 
	Len Brown <len.brown@...el.com>, Kai Huang <kai.huang@...el.com>, 
	"binbin.wu@...ux.intel.com" <binbin.wu@...ux.intel.com>, Xiaoyao Li <xiaoyao.li@...el.com>, 
	Reinette Chatre <reinette.chatre@...el.com>, 
	"kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>, 
	"tony.lindgren@...ux.intel.com" <tony.lindgren@...ux.intel.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>, 
	"pbonzini@...hat.com" <pbonzini@...hat.com>, Isaku Yamahata <isaku.yamahata@...el.com>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Yan Y Zhao <yan.y.zhao@...el.com>, 
	Ira Weiny <ira.weiny@...el.com>
Subject: Re: [PATCH RFC 1/2] KVM: TDX: Disable general support for MWAIT in guest

On Tue, Aug 19, 2025, Rick P Edgecombe wrote:
> On Tue, 2025-08-19 at 10:38 +0300, Adrian Hunter wrote:
> > On 18/08/2025 21:49, Edgecombe, Rick P wrote:
> > > Attn: Binbin, Xiaoyao
> > > 
> > > On Mon, 2025-08-18 at 07:05 -0700, Sean Christopherson wrote:
> > > > NAK.
> > > > 
> > > > Fix the guest, or wherever else in the pile there are issues.  KVM is NOT carrying
> > > > hack-a-fixes to workaround buggy software/firmware.  Been there, done that.
> > > 
> > > Yes, I would have thought we should have at least had a TDX module change option
> > > for this.
> > 
> > That would not help with existing TDX Modules, and would possibly require
> > a guest opt-in, which would not help with existing guests.  Hence, to start
> > with disabling the feature first, and look for another solution second.
> 
> I think you have the priorities wrong. There are only so many kludges we can ask
> KVM to take. Across all the changes people want for TDX, do you think not having
> to update the TDX module, backport a guest fix or even just adjust qemu args is
> more important the other stuff?

I'm especially sensitive to fudging around _bugs_ in other pieces of the stack.
KVM has been burned badly, multiple times, by hacking around issues elsewhere.
There are inevitably cases where throwing something into KVM is the least awful
choice (usually because it's the only feasible choise), but this ain't one of
those cases.

> TDX support is still very early. We need to think about long term sustainable
> solutions. So a fix that doesn't support existing TDX modules or guests (the
> intel_idle fix is also in this category anyway) should absolutely be on the
> table.
> 
> > 
> > In the MWAIT case, Sean has rejected supporting MSR_PKG_CST_CONFIG_CONTROL
> > even for VMX, because it is an optional MSR, so altering intel_idle is
> > being proposed.

No, I rejected support MSR_PKG_CST_CONFIG_CONTROL _in KVM_ because I don't see
any reason to shove information into KVM.  AFAICT, it's not an "architectural"
MSR, and all of KVM's existing handling of truly uarch/model-specific MSRs is
painful and ugly.

And userspace (e.g. QEMU) could support emulate MSR_PKG_CST_CONFIG_CONTROL (and
any other MSRs of that nature) via MSR filters.  I doubt the MSR is accessed
outside of boot paths, so the cost of a userspace exit should be a non-issue.
Of course, QEMU probably can't provide useful/accurate information.

One option if there is a super strong need to do so would be to add a "disable
exits" capability to let the guest read package c-state MSRs, but that has
obvious downsides and would still just be fudging around a flawed driver.

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8dbf19aa66ef..c254aa26ff22 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4120,6 +4120,15 @@ void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
                vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R);
                vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C7_RESIDENCY, MSR_TYPE_R);
        }
+       if (kvm_package_cstate_in_guest(vcpu->kvm)) {
+               vmx_disable_intercept_for_msr(vcpu, MSR_PKG_CST_CONFIG_CONTROL, MSR_TYPE_R);
+               vmx_disable_intercept_for_msr(vcpu, MSR_PKG_C2_RESIDENCY, MSR_TYPE_R);
+               vmx_disable_intercept_for_msr(vcpu, MSR_PKG_C3_RESIDENCY, MSR_TYPE_R);
+               vmx_disable_intercept_for_msr(vcpu, MSR_PKG_C6_RESIDENCY, MSR_TYPE_R);
+               vmx_disable_intercept_for_msr(vcpu, MSR_PKG_C8_RESIDENCY, MSR_TYPE_R);
+               vmx_disable_intercept_for_msr(vcpu, MSR_PKG_C9_RESIDENCY, MSR_TYPE_R);
+               vmx_disable_intercept_for_msr(vcpu, MSR_PKG_C10_RESIDENCY, MSR_TYPE_R);
+       }
        if (kvm_aperfmperf_in_guest(vcpu->kvm)) {
                vmx_disable_intercept_for_msr(vcpu, MSR_IA32_APERF, MSR_TYPE_R);
                vmx_disable_intercept_for_msr(vcpu, MSR_IA32_MPERF, MSR_TYPE_R);

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ