Message-ID: <20221110084418.t7iv5zlfgiu77gfn@linux.intel.com>
Date: Thu, 10 Nov 2022 16:44:19 +0800
From: Yu Zhang <yu.c.zhang@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, Eric Li <ercli@...avis.edu>,
David Matlack <dmatlack@...gle.com>,
Oliver Upton <oupton@...gle.com>,
Liu Jingqi <jingqi.liu@...el.com>
Subject: Re: [PATCH v5 05/15] KVM: nVMX: Let userspace set nVMX MSR to any
_host_ supported value
>
> No. Again, KVM _should never_ manipulate VMX MSRs in response to CPUID changes.
> Keeping the existing behavior would be done purely to maintain backwards
> compatibility with existing userspace, not because it's strictly the right thing to do.
>
> E.g. as a strawman, a weird userspace could do KVM_SET_MSRS => KVM_SET_CPUID =>
> KVM_SET_CPUID, where the first KVM_SET_CPUID reset to a base config and the second
> KVM_SET_CPUID incorporates "optional" features. In that case, clearing bits in
> the VMX MSRs on the first KVM_SET_CPUID would do the wrong thing if the second
> KVM_SET_CPUID enabled the relevant features.
>
> AFAIK, no userspace actually does something odd like that, whereas there are VMMs
> that do KVM_SET_MSRS before KVM_SET_CPUID, e.g. disable a feature in VMX MSRs but
> later enable the feature in CPUID for L1. And so disabling features is likely
> safe-ish, but enabling features most definitely can cause problems for userspace.
>
> Hrm, actually, there are likely older VMMs that never set VMX MSRs, and so dropping
> the "enable features" code might not be safe either. Grr. The obvious solution
> would be to add a quirk, but maybe we can avoid a quirk by skipping KVM's
> misguided updates if userspace has set the MSR. That should work for a userspace
> that deliberately sets the MSR during setup, and for a userspace that blindly
> migrates the MSR since the migrated value should already be correct/sane.
>
Oh. Just saw your new selftest code, and finally got your point (I hope
so...). Thanks!
> > BTW, I found my previous understanding of what vmx_adjust_secondary_exec_control()
> > currently does was also wrong. It could also be used for EXITING controls. And
> > for such flags(e.g., SECONDARY_EXEC_RDRAND_EXITING), values for the nested settings
> > (vmx->nested.msrs.secondary_ctls_high) and for the L1 execution controls(*exec_control)
> > could be opposite. So the statement:
> > "1> For now, what vmx_adjust_secondary_exec_control() does, is to enable/
> > disable a feature in VMX MSR(and nVMX MSR) based on cpuid changes."
> > is wrong.
>
> No, it's correct. The EXITING controls are just inverted feature flags. E.g. if
> RDRAND is disabled in CPUID, KVM sets the EXITING control so that KVM intercepts
> RDRAND in order to inject #UD.
>
> [EXIT_REASON_RDRAND] = kvm_handle_invalid_op,
>
Well, suppose:
- cpu_has_vmx_rdrand() is true;
- meanwhile, guest_cpuid_has(vcpu, X86_FEATURE_RDRAND) is false.

Then what vmx_adjust_secondary_exec_control() currently does is:
1> keep SECONDARY_EXEC_RDRAND_EXITING set in L1's secondary proc-based
   execution control;
2> and then clear SECONDARY_EXEC_RDRAND_EXITING in the high bits of
   the IA32_VMX_PROCBASED_CTLS2 MSR for nested, via
     vmx->nested.msrs.secondary_ctls_high &= ~control;

That means the L1 VMM must leave SECONDARY_EXEC_RDRAND_EXITING cleared
in its (VMCS12's) secondary proc-based VM-execution control, even when
RDRAND is disabled in both L1's and L2's CPUID.
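
To make the case above concrete, here is a rough user-space model of the
logic as I read it. It is only a sketch, not the kernel code: struct
model_vmx, its two fields, the MODEL_RDRAND_EXITING constant and both
function names are made up for illustration, and the real function also
checks whether nested is enabled, which is left out here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit for the RDRAND-exiting control (hypothetical name). */
#define MODEL_RDRAND_EXITING (1u << 11)

/* Hypothetical stand-in for the two pieces of state discussed above. */
struct model_vmx {
	uint32_t exec_control;      /* secondary exec controls used to run L1 */
	uint32_t nested_ctls2_high; /* allowed-1 bits advertised to the L1 VMM */
};

/*
 * Simplified model of the adjustment:
 *  - enabled: the feature is exposed to the guest in CPUID
 *  - exiting: the control is an "exiting" (inverted) control
 */
static void model_adjust_secondary_exec_control(struct model_vmx *vmx,
						uint32_t control,
						bool enabled, bool exiting)
{
	/*
	 * Opt-in control and feature hidden, or exiting control and
	 * feature exposed: drop the bit from L1's controls.
	 */
	if (enabled == exiting)
		vmx->exec_control &= ~control;

	/* The nested allowed-1 bit simply follows "enabled". */
	if (enabled)
		vmx->nested_ctls2_high |= control;
	else
		vmx->nested_ctls2_high &= ~control;
}

/* The scenario above: hardware supports the control, guest CPUID hides RDRAND. */
static struct model_vmx model_rdrand_hidden_from_guest(void)
{
	struct model_vmx vmx = {
		.exec_control      = MODEL_RDRAND_EXITING,
		.nested_ctls2_high = MODEL_RDRAND_EXITING,
	};

	model_adjust_secondary_exec_control(&vmx, MODEL_RDRAND_EXITING,
					    false /* enabled */,
					    true  /* exiting */);
	return vmx;
}
```

With this model, model_rdrand_hidden_from_guest() returns a state where
the bit is still set in exec_control (so KVM keeps intercepting RDRAND
for L1) but is cleared in nested_ctls2_high (so the L1 VMM cannot set it
in VMCS12).
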

I wonder: in a native environment, if an instruction is not supported,
will the allowed-1 setting for its corresponding exiting control in the
IA32_VMX_PROCBASED_CTLS2 MSR be set or cleared? Maybe it should be
cleared, so that executing such an instruction in VMX non-root mode just
gets a #UD directly instead of triggering a VM-Exit?

Note: I do not think this will cause any problem. I am just curious
whether the L1 VMM can observe behavior that would not occur natively
(only because of what we are doing in KVM).

B.R.
Yu