lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d3f4ae0dafb9a3ac3fcd7cec5de3dc4285896b82.camel@redhat.com>
Date: Wed, 24 Jul 2024 13:28:20 -0400
From: Maxim Levitsky <mlevitsk@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Vitaly Kuznetsov
 <vkuznets@...hat.com>,  kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
 Hou Wenlong <houwenlong.hwl@...group.com>, Kechen Lu <kechenl@...dia.com>,
 Oliver Upton <oliver.upton@...ux.dev>, Binbin Wu
 <binbin.wu@...ux.intel.com>, Yang Weijiang <weijiang.yang@...el.com>,
 Robert Hoo <robert.hoo.linux@...il.com>
Subject: Re: [PATCH v2 03/49] KVM: x86: Account for KVM-reserved CR4 bits
 when passing through CR4 on VMX

On Tue, 2024-07-09 at 12:58 -0700, Sean Christopherson wrote:
> On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> > On Fri, 2024-05-17 at 10:38 -0700, Sean Christopherson wrote:
> > > Drop x86.c's local pre-computed cr4_reserved bits and instead fold KVM's
> > > reserved bits into the guest's reserved bits.  This fixes a bug where VMX's
> > > set_cr4_guest_host_mask() fails to account for KVM-reserved bits when
> > > deciding which bits can be passed through to the guest.  In most cases,
> > > letting the guest directly write reserved CR4 bits is ok, i.e. attempting
> > > to set the bit(s) will still #GP, but not if a feature is available in
> > > hardware but explicitly disabled by the host, e.g. if FSGSBASE support is
> > > disabled via "nofsgsbase".
> > > 
> > > Note, the extra overhead of computing host reserved bits every time
> > > userspace sets guest CPUID is negligible.  The feature bits that are
> > > queried are packed nicely into a handful of words, and so checking and
> > > setting each reserved bit costs in the neighborhood of ~5 cycles, i.e. the
> > > total cost will be in the noise even if the number of checked CR4 bits
> > > doubles over the next few years.  In other words, x86 will run out of CR4
> > > bits long before the overhead becomes problematic.
> > 
> > It might be just me, but IMHO this justification is confusing, leading me to
> > belive that maybe the code is on the hot-path instead.
> > 
> > The right justification should be just that this code is in
> > kvm_vcpu_after_set_cpuid is usually (*) only called once per vCPU (twice
> > after your patch #1)
> 
> Ya.  I was trying to capture that even if that weren't true, i.e. even if userspace
> was doing something odd, that the extra cost is irrelevant.  I'll expand and reword
> the paragraph to make it clear this isn't a hot path for any sane userspace.
Thank you!

> 
> > (*) Qemu also calls it, each time vCPU is hotplugged but this doesn't change
> > anything performance wise.
> 
> ...
> 
> > > @@ -9831,10 +9826,6 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
> > >  	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
> > >  		kvm_caps.supported_xss = 0;
> > >  
> > > -#define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
> > > -	cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_);
> > > -#undef __kvm_cpu_cap_has
> > > -
> > >  	if (kvm_caps.has_tsc_control) {
> > >  		/*
> > >  		 * Make sure the user can only configure tsc_khz values that
> > 
> > I mostly agree with this patch - caching always carries risks and when it doesn't
> > value performance wise, it should always be removed.
> > 
> > 
> > However I don't think that this patch fixes a bug as it claims:
> > 
> > This is the code prior to this patch:
> > 
> > kvm_x86_vendor_init ->
> > 
> > 	r = ops->hardware_setup();
> > 		svm_hardware_setup
> > 			svm_set_cpu_caps + kvm_set_cpu_caps
> > 
> > 		-- or --
> > 
> > 		vmx_hardware_setup ->
> > 			vmx_set_cpu_caps + + kvm_set_cpu_caps
> > 
> > 
> > 	# read from 'kvm_cpu_caps'
> > 	cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_);
> > 
> > 
> > AFAIK kvm cpu caps are never touched outside of svm_set_cpu_caps/vmx_hardware_setup
> > (they don't depend on some later post-processing, cpuid, etc).
> > 
> > In fact a good refactoring would to make kvm_cpu_caps const after this point,
> > using cast, assert or something like that.
> > 
> > This leads me to believe that cr4_reserved_bits is computed correctly.
> 
> cr4_reserved_bits is computed correctly.  The bug is that cr4_reserved_bits isn't
> consulted by set_cr4_guest_host_mask(), which is what I meant by "KVM-reserved
> bits" in the changelog.

Ah, I see it now.

I also see that set_cr4_guest_host_mask, limits the guest owned bits to a small
whitelist, and none of these bits looks scary, so it all make sense that this
is mostly a theoretical bug, but for sure worth fixing.


> 
> > I could be wrong, but then IMHO it is a very good idea to provide an explanation
> > on how this bug can happen.
> 
> The first paragraph of the changelog tries to do that, and I'm struggling to come
> up with different wording that makes it more clear what's wrong.  Any ideas/suggestions?
> 

I also re-read it, and now it all makes sense. I guess I just somehow got fixed on
thinking that cr4_reserved_bits was not computed incorrectly rather than just
not used.
The comment indeed now makes sense to me, so let it be as it is.


Reviewed-by: Maxim Levitsky <mlevitsk@...hat.com>

Best regards,
	Maxim Levitsky



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ