Message-ID: <14eab14d368e68cb9c94c655349f94f44a9a15b4.camel@redhat.com>
Date: Thu, 01 May 2025 16:35:07 -0400
From: mlevitsk@...hat.com
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>, Borislav
 Petkov <bp@...en8.de>, Paolo Bonzini <pbonzini@...hat.com>, x86@...nel.org,
 Dave Hansen <dave.hansen@...ux.intel.com>, Ingo Molnar <mingo@...hat.com>, 
 linux-kernel@...r.kernel.org, "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH 1/3] x86: KVM: VMX: Wrap GUEST_IA32_DEBUGCTL read/write
 with access functions

On Tue, 2025-04-22 at 16:33 -0700, Sean Christopherson wrote:
> On Tue, Apr 15, 2025, Maxim Levitsky wrote:
> > Instead of reading and writing GUEST_IA32_DEBUGCTL vmcs field directly,
> > wrap the logic with get/set functions.
> 
> Why?  I know why the "set" helper is being added, but it needs to called out.
> 
> Please omit the getter entirely, it does nothing more than obfuscate a very
> simple line of code.

In this patch, yes. But in the next patch I switch to reading from 'vmx->msr_ia32_debugctl'.
Do you want me to open-code this access? I don't mind, if you insist.

> 
> > Also move the checks that the guest's supplied value is valid to the new
> > 'set' function.
> 
> Please do this in a separate patch.  There's no need to mix refactoring and
> functional changes.

I thought that it was natural to do this in the same patch. In this patch I introduce
'vmx_set_guest_debugctl', which should be used any time we set the MSR from a
guest-supplied value, and VM entry is one of these cases.

I can split this if you want.

> 
> > In particular, the above change fixes a minor security issue in which L1
> 
> Bug, yes.  Not sure it constitutes a meaningful security issue though.

I also think so, but I wanted to mention this just in case.

> 
> > hypervisor could set the GUEST_IA32_DEBUGCTL, and eventually the host's
> > MSR_IA32_DEBUGCTL
> 
> No, the lack of a consistency check allows the guest to set the MSR in hardware,
> but that is not the host's value.

That's what I meant: the guest can set the real hardware MSR. Yes, after the
guest exits, the host's value is restored. I'll rephrase this in v2.

> 
> > to any value by performing a VM entry to L2 with VM_ENTRY_LOAD_DEBUG_CONTROLS
> > set.
> 
> Any *legal* value.  Setting completely unsupported bits will result in VM-Enter
> failing with a consistency check VM-Exit.

True.

> 
> > Signed-off-by: Maxim Levitsky <mlevitsk@...hat.com>
> > ---
> >  arch/x86/kvm/vmx/nested.c    | 15 +++++++---
> >  arch/x86/kvm/vmx/pmu_intel.c |  9 +++---
> >  arch/x86/kvm/vmx/vmx.c       | 58 +++++++++++++++++++++++-------------
> >  arch/x86/kvm/vmx/vmx.h       |  3 ++
> >  4 files changed, 57 insertions(+), 28 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index e073e3008b16..b7686569ee09 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -2641,6 +2641,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> >  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> >  	struct hv_enlightened_vmcs *evmcs = nested_vmx_evmcs(vmx);
> >  	bool load_guest_pdptrs_vmcs12 = false;
> > +	u64 new_debugctl;
> >  
> >  	if (vmx->nested.dirty_vmcs12 || nested_vmx_is_evmptr12_valid(vmx)) {
> >  		prepare_vmcs02_rare(vmx, vmcs12);
> > @@ -2653,11 +2654,17 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> >  	if (vmx->nested.nested_run_pending &&
> >  	    (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)) {
> >  		kvm_set_dr(vcpu, 7, vmcs12->guest_dr7);
> > -		vmcs_write64(GUEST_IA32_DEBUGCTL, vmcs12->guest_ia32_debugctl);
> > +		new_debugctl = vmcs12->guest_ia32_debugctl;
> >  	} else {
> >  		kvm_set_dr(vcpu, 7, vcpu->arch.dr7);
> > -		vmcs_write64(GUEST_IA32_DEBUGCTL, vmx->nested.pre_vmenter_debugctl);
> > +		new_debugctl = vmx->nested.pre_vmenter_debugctl;
> >  	}
> > +
> > +	if (CC(!vmx_set_guest_debugctl(vcpu, new_debugctl, false))) {
> 
> The consistency check belongs in nested_vmx_check_guest_state(), only needs to
> check the VM_ENTRY_LOAD_DEBUG_CONTROLS case, and should be posted as a separate
> patch.

I can move it there. Can you explain why you want this, though? Is it because of the
order of checks specified in the PRM?

Currently GUEST_IA32_DEBUGCTL of the host is *written* in prepare_vmcs02. 
Should I also move this write to nested_vmx_check_guest_state?

Or should I write the value blindly in prepare_vmcs02, then check the value
of 'vmx->msr_ia32_debugctl' in nested_vmx_check_guest_state and fail if it
contains reserved bits?
I don't like that idea much.


> 
> > +		*entry_failure_code = ENTRY_FAIL_DEFAULT;
> > +		return -EINVAL;
> > +	}
> > +
> > +static void __vmx_set_guest_debugctl(struct kvm_vcpu *vcpu, u64 data)
> > +{
> > +	vmcs_write64(GUEST_IA32_DEBUGCTL, data);
> > +}
> > +
> > +bool vmx_set_guest_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated)
> > +{
> > +	u64 invalid = data & ~vmx_get_supported_debugctl(vcpu, host_initiated);
> > +
> > +	if (invalid & (DEBUGCTLMSR_BTF|DEBUGCTLMSR_LBR)) {
> > +		kvm_pr_unimpl_wrmsr(vcpu, MSR_IA32_DEBUGCTLMSR, data);
> > +		data &= ~(DEBUGCTLMSR_BTF|DEBUGCTLMSR_LBR);
> > +		invalid &= ~(DEBUGCTLMSR_BTF|DEBUGCTLMSR_LBR);
> > +	}
> > +
> > +	if (invalid)
> > +		return false;
> > +
> > +	if (is_guest_mode(vcpu) && (get_vmcs12(vcpu)->vm_exit_controls &
> > +					VM_EXIT_SAVE_DEBUG_CONTROLS))
> > +		get_vmcs12(vcpu)->guest_ia32_debugctl = data;
> > +
> > +	if (intel_pmu_lbr_is_enabled(vcpu) && !to_vmx(vcpu)->lbr_desc.event &&
> > +	    (data & DEBUGCTLMSR_LBR))
> > +		intel_pmu_create_guest_lbr_event(vcpu);
> > +
> > +	__vmx_set_guest_debugctl(vcpu, data);
> > +	return true;
> 
> Return 0/-errno, not true/false.

There are plenty of functions in this file, and in KVM generally, that return a boolean.

e.g.:

static bool nested_vmx_check_eptp(struct kvm_vcpu *vcpu, u64 new_eptp)
static inline bool vmx_control_verify(u32 control, u32 low, u32 high)
static bool nested_evmcs_handle_vmclear(struct kvm_vcpu *vcpu, gpa_t vmptr)

static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
						 struct vmcs12 *vmcs12)

static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)

...


I personally think that functions that emulate hardware should return boolean values
or some hardware-specific status code (e.g. a VMX failure code), because real hardware
never returns -EINVAL and the like.


Best regards,
	Maxim Levitsky




> 

