linux-kernel - Re: [PATCH RFC v1 00/10] KVM: nVMX: Use vmcs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87tu86um4i.fsf@redhat.com>
Date:   Mon, 27 Jun 2022 18:06:21 +0200
From:   Vitaly Kuznetsov <vkuznets@...hat.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     kvm@...r.kernel.org, Paolo Bonzini <pbonzini@...hat.com>,
        Anirudh Rayabharam <anrayabh@...ux.microsoft.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Maxim Levitsky <mlevitsk@...hat.com>,
        linux-hyperv@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC v1 00/10] KVM: nVMX: Use vmcs_config for setting up
 nested VMX MSRs

Sean Christopherson <seanjc@...gle.com> writes:

> On Wed, Jun 22, 2022, Vitaly Kuznetsov wrote:
>> vmcs_config is a sanitized version of host VMX MSRs where some controls are
>> filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are 
>> discovered, some inconsistencies in controls are detected,...) but
>> nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
>> in exposing undesired controls to L1. Switch to using vmcs_config instead.
>> 
>> RFC part: vmcs_config's sanitization now is a mix of "what can't be enabled"
>> and "what KVM doesn't want" and we need to separate these as for nested VMX
>> MSRs only the first category makes sense. This gives vmcs_config a slightly
>> different meaning "controls which can be (theoretically) used". An alternative
>> approach would be to store sanitized host MSRs values separately, sanitize
>> them and and use in nested_vmx_setup_ctls_msrs() but currently I don't see
>> any benefits. Comments welcome!
>
> I like the idea overall, even though it's a decent amount of churn.  It seems
> easier to maintain than separate paths for nested.  The alternative would be to
> add common helpers to adjust the baseline configurations, but I don't see any
> way to programmatically make that approach more robust.
>
> An idea to further harden things.  Or: an excuse to extend macro shenanigans :-)
>
> If we throw all of the "opt" and "min" lists into macros, e.g. KVM_REQUIRED_*
> and KVM_OPTIONAL_*, and then use those to define a KVM_KNOWN_* field, we can
> prevent using the mutators to set/clear unknown bits at runtime via BUILD_BUG_ON().
> The core builders, e.g. vmx_exec_control(), can still set unknown bits, i.e. set
> bits that aren't enumerated to L1, but that's easier to audit and this would catch
> regressions for the issues fixed in patches.
>
> It'll required making add_atomic_switch_msr_special() __always_inline (or just
> open code it), but that's not a big deal.
>
> E.g. if we have
>
>   #define KVM_REQUIRED_CPU_BASED_VM_EXEC_CONTROL <blah blah blah>
>   #define KVM_OPTIONAL_CPU_BASED_VM_EXEC_CONTROL <blah blah blah>
>
> Then the builders for the controls shadows can do:
>
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 286c88e285ea..5eb75822a09e 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -468,6 +468,8 @@ static inline u8 vmx_get_rvi(void)
>  }
>  
>  #define BUILD_CONTROLS_SHADOW(lname, uname, bits)                              \
> +#define KVM_KNOWN_ ## uname                                                    \
> +       (KVM_REQUIRED_ ## uname | KVM_OPTIONAL_ ## uname)                       \

I'm certainly not a macro jedi and I failed to make this compile as gcc
hates when I put '#define's in macros but I made a simpler version with
(presumeably) the same outcome. v1 is out, thanks for the suggestion!

>  static inline void lname##_controls_set(struct vcpu_vmx *vmx, u##bits val)     \
>  {                                                                              \
>         if (vmx->loaded_vmcs->controls_shadow.lname != val) {                   \
> @@ -485,10 +487,12 @@ static inline u##bits lname##_controls_get(struct vcpu_vmx *vmx)          \
>  }                                                                              \
>  static inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits val)  \
>  {                                                                              \
> +       BUILD_BUG_ON(!(val & KVM_KNOWN_ ## uname));                             \
>         lname##_controls_set(vmx, lname##_controls_get(vmx) | val);             \
>  }                                                                              \
>  static inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits val)        \
>  {                                                                              \
> +       BUILD_BUG_ON(!(val & KVM_KNOWN_ ## uname));                             \
>         lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val);            \
>  }
>  BUILD_CONTROLS_SHADOW(vm_entry, VM_ENTRY_CONTROLS, 32)
>
>
>
> Handling the controls that are restricted to CONFIG_X86_64=y will be midly annoying,
> but adding a base set isn't too bad, e.g.
>
> #define __KVM_REQUIRED_CPU_BASED_VM_EXEC_CONTROL <blah blah blah>
> #ifdef CONFIG_X86_64
> #define KVM_REQUIRED_CPU_BASED_VM_EXEC_CONTROL (__KVM_REQUIRED_CPU_BASED_VM_EXEC_CONTROL | \
> 						CPU_BASED_CR8_LOAD_EXITING |		   \
> 						CPU_BASED_CR8_STORE_EXITING)
> #else
> #define KVM_REQUIRED_CPU_BASED_VM_EXEC_CONTROL __KVM_REQUIRED_CPU_BASED_VM_EXEC_CONTROL
> #endif
>

-- 
Vitaly