lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aC-otXnBwHsdZ7B4@google.com>
Date: Thu, 22 May 2025 15:44:05 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Maxim Levitsky <mlevitsk@...hat.com>
Cc: kvm@...r.kernel.org, "H. Peter Anvin" <hpa@...or.com>, 
	Thomas Gleixner <tglx@...utronix.de>, Dave Hansen <dave.hansen@...ux.intel.com>, 
	Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org, 
	x86@...nel.org, Paolo Bonzini <pbonzini@...hat.com>, Chao Gao <chao.gao@...el.com>, 
	Jim Mattson <jmattson@...gle.com>
Subject: Re: [PATCH v5 3/5] KVM: nVMX: check vmcs12->guest_ia32_debugctl value
 given by L2

+Jim

On Thu, May 22, 2025, Sean Christopherson wrote:
> On Wed, May 21, 2025, Maxim Levitsky wrote:
> > Check the vmcs12 guest_ia32_debugctl value before loading it, to avoid L2
> > being able to load arbitrary values to hardware IA32_DEBUGCTL.
> > 
> > Reviewed-by: Chao Gao <chao.gao@...el.com>
> > Signed-off-by: Maxim Levitsky <mlevitsk@...hat.com>
> > ---
> >  arch/x86/kvm/vmx/nested.c | 3 ++-
> >  arch/x86/kvm/vmx/vmx.c    | 2 +-
> >  arch/x86/kvm/vmx/vmx.h    | 1 +
> >  3 files changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index e073e3008b16..00f2b762710c 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -3146,7 +3146,8 @@ static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
> >  		return -EINVAL;
> >  
> >  	if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS) &&
> > -	    CC(!kvm_dr7_valid(vmcs12->guest_dr7)))
> > +	    (CC(!kvm_dr7_valid(vmcs12->guest_dr7)) ||
> > +	     CC(vmcs12->guest_ia32_debugctl & ~vmx_get_supported_debugctl(vcpu, false))))
> 
> This is a breaking change.  For better or worse (read: worse), KVM's ABI is to
> drop BTF and LBR if they're unsupported (the former is always unsupported).
> Failure to honor that ABI means L1 can't excplitly load what it think is its
> current value into L2.
> 
> I'll slot in a path to provide another helper for checking the validity of
> DEBUGCTL.  I think I've managed to cobble together something that isn't too
> horrific (options are a bit limited due to the existing ugliness).

And then Jim ruined my day.  :-)

As evidenced by this hilarious KUT testcase, it's entirely possible there are
existing KVM guests that are utilizing/abusing DEBUGCTL features.

	/*
	 * Optional RTM test for hardware that supports RTM, to
	 * demonstrate that the current volume 3 of the SDM
	 * (325384-067US), table 27-1 is incorrect. Bit 16 of the exit
	 * qualification for debug exceptions is not reserved. It is
	 * set to 1 if a debug exception (#DB) or a breakpoint
	 * exception (#BP) occurs inside an RTM region while advanced
	 * debugging of RTM transactional regions is enabled.
	 */
	if (this_cpu_has(X86_FEATURE_RTM)) {
		vmcs_write(ENT_CONTROLS,
			   vmcs_read(ENT_CONTROLS) | ENT_LOAD_DBGCTLS);
		/*
		 * Set DR7.RTM[bit 11] and IA32_DEBUGCTL.RTM[bit 15]
		 * in the guest to enable advanced debugging of RTM
		 * transactional regions.
		 */
		vmcs_write(GUEST_DR7, BIT(11));
		vmcs_write(GUEST_DEBUGCTL, BIT(15));
		single_step_guest("Hardware delivered single-step in "
				  "transactional region", starting_dr6, 0);
		check_db_exit(false, false, false, &xbegin, BIT(16),
			      starting_dr6);
	} else {
		vmcs_write(GUEST_RIP, (u64)&skip_rtm);
		enter_guest();
	}

For RTM specifically, disallowing DEBUGCTL.RTM but allowing DR7.RTM seems a bit
silly.  Unless there's a security concern, that can probably be fixed by adding
support for RTM.  Note, there's also a virtualization hole here, as KVM doesn't
vet DR7 beyond checking that bits 63:32 are zero, i.e. a guest could set DR7.RTM
even if KVM doesn't advertise support.  Of course, closing that hole would require
completely dropping support for disabling DR interception, since VMX doesn't
give per-DR controls.

For the other bits, I don't see a good solution.  The only viable options I see
are to silently drop all unsupported bits (maybe with a quirk?), or enforce all
bits and cross our fingers that no L1 VMM is running guests with those bits set
in GUEST_DEBUGCTL.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ