[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YwPLt2e7CuqMzjt1@google.com>
Date: Mon, 22 Aug 2022 18:32:23 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Vitaly Kuznetsov <vkuznets@...hat.com>
Cc: kvm@...r.kernel.org, Paolo Bonzini <pbonzini@...hat.com>,
Anirudh Rayabharam <anrayabh@...ux.microsoft.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Maxim Levitsky <mlevitsk@...hat.com>,
Nathan Chancellor <nathan@...nel.org>,
Michael Kelley <mikelley@...rosoft.com>,
linux-hyperv@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 03/26] x86/hyperv: Update 'struct hv_enlightened_vmcs'
definition
On Mon, Aug 22, 2022, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@...gle.com> writes:
>
> > On Mon, Aug 22, 2022, Vitaly Kuznetsov wrote:
> >> My initial implementation was inventing 'eVMCS revision' concept:
> >> https://lore.kernel.org/kvm/20220629150625.238286-7-vkuznets@redhat.com/
> >>
> >> which is needed if we don't gate all these new fields on CPUID.0x4000000A.EBX BIT(0).
> >>
> >> Going forward, we will still (likely) need something when new fields show up.
> >
> > My comments from that thread still apply. Adding "revisions" or feature flags
> > isn't maintanable, e.g. at best KVM will end up with a ridiculous number of flags.
> >
> > Looking at QEMU, which I strongly suspect is the only VMM that enables
> > KVM_CAP_HYPERV_ENLIGHTENED_VMCS, it does the sane thing of enabling the capability
> > before grabbing the VMX MSRs.
> >
> > So, why not simply apply filtering for host accesses as well?
>
> (I understand that using QEMU to justify KVM's behavior is flawed but...)
>
> QEMU's migration depends on the assumption that identical QEMU's command
> lines create identical (from guest PoV) configurations. Assume we have
> (simplified)
>
> "-cpu CascadeLake-Sever,hv-evmcs"
>
> on both source and destination but source host is newer, i.e. its KVM
> knows about TSC Scaling in eVMCS and destination host has no idea about
> it. If we just apply filtering upon vCPU creation, guest visible MSR
> values are going to be different, right? Ok, assuming QEMU also migrates
> VMX feature MSRs (TODO: check if that's true), we will be able to fail
> mirgration late (which is already much worse than not being able to
> create the desired configuration on destination, 'fail early') if we use
> in-KVM filtering to throw an error to userspace. But if we blindly
> filter control MSRs on the destination, 'TscScaling' will just disapper
> undreneath the guest. This is unlikely to work.
But all of that holds true irrespetive of eVMCS. If QEMU attempts to migrate a
nested guest from a KVM that supports TSC_SCALING to a KVM that doesn't support
TSC_SCALING, then TSC_SCALING is going to disappear and VM-Entry on the dest will
fail. I.e. QEMU _must_ be able to detect the incompatibility and not attempt
the migration. And with that code in place, QEMU doesn't need to do anything new
for eVMCS, it Just Works.
> In any case, what we need, is an option for VMM (read: QEMU) to create
> the configuration with 'TscScaling' filtered out even KVM supports the
> bit in eVMCS. This way the guest will be able to migrate backwards to an
> older KVM which doesn't support it, i.e.
>
> '-cpu CascadeLake-Sever,hv-evmcs'
> creates the 'origin' eVMCS configuration, no TscScaling
>
> '-cpu CascadeLake-Sever,hv-evmcs,hv-evmcs-2022' creates the updated one.
Again, this conundrum exists irrespective of eVMCS. Properly solve the problem
for regular nVMX and eVMCS should naturally work.
> KVM_CAP_HYPERV_ENLIGHTENED_VMCS is bad as it only takes 'eVMCS' version
> as a parameter (as we assumed it will always change when new fields are
> added, but that turned out to be false). That's why I suggested
> KVM_CAP_HYPERV_ENLIGHTENED_VMCS2.
Enumerating features via versions is such a bad API though, e.g. if there's a
bug with nested TSC_SCALING, userspace can't disable just nested TSC_SCALING
without everything else under the inscrutable "hv-evmcs-2022" being collateral
damage.
Powered by blists - more mailing lists