[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CALMp9eTHefPLtNwCgqieWbA4tE489Wyi_gVO9P59YxU+k0wgyg@mail.gmail.com>
Date: Fri, 9 Jan 2026 11:33:18 -0800
From: Jim Mattson <jmattson@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Maxim Levitsky <mlevitsk@...hat.com>, kvm@...r.kernel.org,
"H. Peter Anvin" <hpa@...or.com>, Thomas Gleixner <tglx@...utronix.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, Borislav Petkov <bp@...en8.de>,
Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org, x86@...nel.org,
Paolo Bonzini <pbonzini@...hat.com>, Chao Gao <chao.gao@...el.com>
Subject: Re: [PATCH v5 3/5] KVM: nVMX: check vmcs12->guest_ia32_debugctl value
given by L2
On Wed, Jun 4, 2025 at 7:02 AM Sean Christopherson <seanjc@...gle.com> wrote:
>
> On Thu, May 22, 2025, Sean Christopherson wrote:
> > +Jim
> >
> > On Thu, May 22, 2025, Sean Christopherson wrote:
> > > On Wed, May 21, 2025, Maxim Levitsky wrote:
> > > > Check the vmcs12 guest_ia32_debugctl value before loading it, to avoid L2
> > > > being able to load arbitrary values to hardware IA32_DEBUGCTL.
> > > >
> > > > Reviewed-by: Chao Gao <chao.gao@...el.com>
> > > > Signed-off-by: Maxim Levitsky <mlevitsk@...hat.com>
> > > > ---
> > > > arch/x86/kvm/vmx/nested.c | 3 ++-
> > > > arch/x86/kvm/vmx/vmx.c | 2 +-
> > > > arch/x86/kvm/vmx/vmx.h | 1 +
> > > > 3 files changed, 4 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > > > index e073e3008b16..00f2b762710c 100644
> > > > --- a/arch/x86/kvm/vmx/nested.c
> > > > +++ b/arch/x86/kvm/vmx/nested.c
> > > > @@ -3146,7 +3146,8 @@ static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
> > > > return -EINVAL;
> > > >
> > > > if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS) &&
> > > > - CC(!kvm_dr7_valid(vmcs12->guest_dr7)))
> > > > + (CC(!kvm_dr7_valid(vmcs12->guest_dr7)) ||
> > > > + CC(vmcs12->guest_ia32_debugctl & ~vmx_get_supported_debugctl(vcpu, false))))
> > >
> > > This is a breaking change. For better or worse (read: worse), KVM's ABI is to
> > > drop BTF and LBR if they're unsupported (the former is always unsupported).
> > > Failure to honor that ABI means L1 can't excplitly load what it think is its
> > > current value into L2.
> > >
> > > I'll slot in a path to provide another helper for checking the validity of
> > > DEBUGCTL. I think I've managed to cobble together something that isn't too
> > > horrific (options are a bit limited due to the existing ugliness).
> >
> > And then Jim ruined my day. :-)
> >
> > As evidenced by this hilarious KUT testcase, it's entirely possible there are
> > existing KVM guests that are utilizing/abusing DEBUGCTL features.
> >
> > /*
> > * Optional RTM test for hardware that supports RTM, to
> > * demonstrate that the current volume 3 of the SDM
> > * (325384-067US), table 27-1 is incorrect. Bit 16 of the exit
> > * qualification for debug exceptions is not reserved. It is
> > * set to 1 if a debug exception (#DB) or a breakpoint
> > * exception (#BP) occurs inside an RTM region while advanced
> > * debugging of RTM transactional regions is enabled.
> > */
> > if (this_cpu_has(X86_FEATURE_RTM)) {
> > vmcs_write(ENT_CONTROLS,
> > vmcs_read(ENT_CONTROLS) | ENT_LOAD_DBGCTLS);
> > /*
> > * Set DR7.RTM[bit 11] and IA32_DEBUGCTL.RTM[bit 15]
> > * in the guest to enable advanced debugging of RTM
> > * transactional regions.
> > */
> > vmcs_write(GUEST_DR7, BIT(11));
> > vmcs_write(GUEST_DEBUGCTL, BIT(15));
> > single_step_guest("Hardware delivered single-step in "
> > "transactional region", starting_dr6, 0);
> > check_db_exit(false, false, false, &xbegin, BIT(16),
> > starting_dr6);
> > } else {
> > vmcs_write(GUEST_RIP, (u64)&skip_rtm);
> > enter_guest();
> > }
> >
> > For RTM specifically, disallowing DEBUGCTL.RTM but allowing DR7.RTM seems a bit
> > silly. Unless there's a security concern, that can probably be fixed by adding
> > support for RTM. Note, there's also a virtualization hole here, as KVM doesn't
> > vet DR7 beyond checking that bits 63:32 are zero, i.e. a guest could set DR7.RTM
> > even if KVM doesn't advertise support. Of course, closing that hole would require
> > completely dropping support for disabling DR interception, since VMX doesn't
> > give per-DR controls.
> >
> > For the other bits, I don't see a good solution. The only viable options I see
> > are to silently drop all unsupported bits (maybe with a quirk?), or enforce all
> > bits and cross our fingers that no L1 VMM is running guests with those bits set
> > in GUEST_DEBUGCTL.
>
> Paolo and I discussed this in PUCK this morning.
>
> We agree trying to close the DR7 virtualization hole would be futile, and that we
> should add support for DEBUGCTL.RTM to avoid breaking use of that specific bit.
>
> For the other DEBUGCTL bits, none of them actually work (although, somewhat
> amusingly, FREEZE_WHILE_SMM would "work" for real SMIs, which aren't visible to
> L1), so we're going to roll the dice, cross our fingers that no existing workload
> is setting those bits only in vmcs12.GUEST_DEBUGCTL, and enforce
> vmx_get_supported_debugctl() at VM-Enter without any quirk.
>
> 6 TR: Setting this bit to 1 enables branch trace messages to be sent.
> 7 BTS: Setting this bit enables branch trace messages (BTMs) to be logged in a BTS buffer.
> 8 BTINT: When clear, BTMs are logged in a BTS buffer in circular fashion.
> 9 BTS_OFF_OS: When set, BTS or BTM is skipped if CPL = 0.
> 10 BTS_OFF_USR: When set, BTS or BTM is skipped if CPL > 0.
> 11 FREEZE_LBRS_ON_PMI: When set, the LBR stack is frozen on a PMI request.
> 12 FREEZE_PERFMON_ON_PMI: When set, each ENABLE bit of the global counter control MSR are frozen on a PMI request.
> 13 ENABLE_UNCORE_PMI: When set, enables the logical processor to receive and generate PMI on behalf of the uncore.
> 14 FREEZE_WHILE_SMM: When set, freezes perfmon and trace messages while in SMM.
I have encountered an existing workload that sets FREEZE_WHILE_SMM,
and this change wantonly breaks it.
KVM should continue to allow legacy workloads to (perhaps
ineffectively) set FREEZE_WHILE_SMM under a quirk .
Powered by blists - more mailing lists