linux-kernel - Re: [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors in nested setups

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aAkNN029DIxYay-j@google.com>
Date: Wed, 23 Apr 2025 08:54:31 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Zack Rusin <zack.rusin@...adcom.com>
Cc: Xin Li <xin@...or.com>, linux-kernel@...r.kernel.org, 
	Doug Covelli <doug.covelli@...adcom.com>, Paolo Bonzini <pbonzini@...hat.com>, 
	Jonathan Corbet <corbet@....net>, Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, 
	Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org, 
	"H. Peter Anvin" <hpa@...or.com>, kvm@...r.kernel.org, linux-doc@...r.kernel.org
Subject: Re: [PATCH v2 4/5] KVM: x86: Add support for legacy VMware backdoors
 in nested setups

On Wed, Apr 23, 2025, Zack Rusin wrote:
> On Wed, Apr 23, 2025 at 9:31 AM Sean Christopherson <seanjc@...gle.com> wrote:
> > Heh, KVM_CAP_EXIT_ON_EMULATION_FAILURE is the odd one out.  Even if that weren't
> > the case, this is one of the situations where diverging from the existing code is
> > desirable, because the existing code is garbage.
> >
> > arch/x86/kvm/x86.c:             if (cap->args[0] & ~kvm_caps.supported_quirks)
> > arch/x86/kvm/x86.c:             if (cap->args[0] & ~KVM_X2APIC_API_VALID_FLAGS)
> > arch/x86/kvm/x86.c:             if (cap->args[0] & ~kvm_get_allowed_disable_exits())
> > arch/x86/kvm/x86.c:                 (cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
> > arch/x86/kvm/x86.c:             if (cap->args[0] & ~KVM_MSR_EXIT_REASON_VALID_MASK)
> > arch/x86/kvm/x86.c:             if (cap->args[0] & ~KVM_BUS_LOCK_DETECTION_VALID_MODE)
> > arch/x86/kvm/x86.c:             if (cap->args[0] & ~KVM_EXIT_HYPERCALL_VALID_MASK) {
> > arch/x86/kvm/x86.c:             if (cap->args[0] & ~1)
> > arch/x86/kvm/x86.c:             if (!enable_pmu || (cap->args[0] & ~KVM_CAP_PMU_VALID_MASK))
> > arch/x86/kvm/x86.c:             if ((u32)cap->args[0] & ~KVM_X86_NOTIFY_VMEXIT_VALID_BITS)
> > virt/kvm/kvm_main.c:            if (cap->flags || (cap->args[0] & ~allowed_options))
> 
> That's because none of those other options are boolean, right? I
> assumed that the options that have valid masks use defines but
> booleans use ~1 because (val & ~1) makes it obvious to the reader that
> the option is in fact a boolean in a way that (val &
> ~KVM_SOME_VALID_BITS) can not.

The entire reason when KVM checks and enforces cap->args[0] is so that KVM can
expand the capability's functionality in the future.  Whether or not a capability
is *currently* a boolean, i.e. only has one supported flag, is completely irrelevant.

KVM has burned itself many times over by not performing checks, e.g. is how we
ended up with things like KVM_CAP_DISABLE_QUIRKS2.

> > > Or are you saying that since I'm already there you'd like to see a
> > > completely separate patch that defines some kind of IS_ZERO_OR_ONE
> > > macro for KVM, use it for KVM_CAP_EXIT_ON_EMULATION_FAILURE and, once
> > > that lands then I can make use of it in this series?
> >
> > Xin is suggesting that you add a macro in arch/x86/include/uapi/asm/kvm.h to
> > #define which bits are valid and which bits are reserved.
> >
> > At a glance, you can kill multiple birds with one stone.  Rather than add three
> > separate capabilities, add one capability and then a variety of flags.  E.g.
> >
> > #define KVM_X86_VMWARE_HYPERCALL        _BITUL(0)
> > #define KVM_X86_VMWARE_BACKDOOR         _BITUL(1)
> > #define KVM_X86_VMWARE_NESTED_BACKDOOR  _BITUL(2)
> > #define KVM_X86_VMWARE_VALID_FLAGS      (KVM_X86_VMWARE_HYPERCALL |
> >                                          KVM_X86_VMWARE_BACKDOOR |
> >                                          KVM_X86_VMWARE_NESTED_BACKDOOR)
> >
> >         case KVM_CAP_X86_VMWARE_EMULATION:
> >                 r = -EINVAL;
> >                 if (cap->args[0] & ~KVM_X86_VMWARE_VALID_FLAGS)
> >                         break;
> >
> >                 mutex_lock(&kvm->lock);
> >                 if (!kvm->created_vcpus) {
> >                         if (cap->args[0] & KVM_X86_VMWARE_HYPERCALL)
> >                                 kvm->arch.vmware.hypercall_enabled = true;
> >                         if (cap->args[0] & KVM_X86_VMWARE_BACKDOOR)
> >                                 kvm->arch.vmware.backdoor_enabled;
> >                         if (cap->args[0] & KVM_X86_VMWARE_NESTED_BACKDOOR)
> >                                 kvm->arch.vmware.nested_backdoor_enabled = true;
> >                         r = 0;
> >                 }
> >                 mutex_unlock(&kvm->lock);
> >                 break;
> >
> > That approach wouldn't let userspace disable previously enabled VMware capabilities,
> > but unless there's a use case for doing so, that should be a non-issue.
> 
> I'd say that if we desperately want to use a single cap for all of
> these then I'd probably prefer a different approach because this would
> make vmware_backdoor_enabled behavior really wacky.

How so?  If kvm.enable_vmware_backdoor is true, then the backdoor is enabled
for all VMs, else it's disabled by default but can be enabled on a per-VM basis
by the new capability.

> It's the one that currently can only be set via kernel boot flags, so having
> systems where the boot flag is on and disabling it on a per-vm basis makes
> sense and breaks with this.

We could go this route, e.g. KVM does something similar for PMU virtualization.
But the key difference is that enable_pmu is enabled by default, whereas
enable_vmware_backdoor is disabled by default.  I.e. it makes far more sense for
the capability to let userspace opt-in, as opposed to opt-out.

> I'd probably still write the code to be able to disable/enable all of them
> because it makes sense for vmware_backdoor_enabled.

Again, that's not KVM's default, and it will never be KVM's default.  Unless there
is a concrete use case for disabling features after *userspace* has opted-in,
just make them one-way 0=>1 transitions so as to keep KVM's implementation as
simple as possible.