Message-ID: <7df23a0d-e2a6-71a7-7641-6363f4905f5c@intel.com>
Date: Sun, 6 Aug 2023 16:44:34 +0800
From: "Yang, Weijiang" <weijiang.yang@...el.com>
To: Sean Christopherson <seanjc@...gle.com>,
Chao Gao <chao.gao@...el.com>
CC: <pbonzini@...hat.com>, <peterz@...radead.org>,
<john.allen@....com>, <kvm@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <rick.p.edgecombe@...el.com>,
<binbin.wu@...ux.intel.com>
Subject: Re: [PATCH v5 11/19] KVM:VMX: Emulate read and write to CET MSRs
On 8/5/2023 5:27 AM, Sean Christopherson wrote:
> On Fri, Aug 04, 2023, Chao Gao wrote:
>> On Thu, Aug 03, 2023 at 12:27:24AM -0400, Yang Weijiang wrote:
>>> Add emulation interface for CET MSR read and write.
>>> The emulation code is split into common part and vendor specific
>>> part, the former resides in x86.c to benefit different x86 CPU
>>> vendors, the latter for VMX is implemented in this patch.
>>>
>>> Signed-off-by: Yang Weijiang <weijiang.yang@...el.com>
>>> ---
>>> arch/x86/kvm/vmx/vmx.c | 27 +++++++++++
>>> arch/x86/kvm/x86.c | 104 +++++++++++++++++++++++++++++++++++++----
>>> arch/x86/kvm/x86.h | 18 +++++++
>>> 3 files changed, 141 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>>> index 6aa76124e81e..ccf750e79608 100644
>>> --- a/arch/x86/kvm/vmx/vmx.c
>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>> @@ -2095,6 +2095,18 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>> else
>>> msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
>>> break;
>>> + case MSR_IA32_S_CET:
>>> + case MSR_KVM_GUEST_SSP:
>>> + case MSR_IA32_INT_SSP_TAB:
>>> + if (kvm_get_msr_common(vcpu, msr_info))
>>> + return 1;
>>> + if (msr_info->index == MSR_KVM_GUEST_SSP)
>>> + msr_info->data = vmcs_readl(GUEST_SSP);
>>> + else if (msr_info->index == MSR_IA32_S_CET)
>>> + msr_info->data = vmcs_readl(GUEST_S_CET);
>>> + else if (msr_info->index == MSR_IA32_INT_SSP_TAB)
>>> + msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
>> This if-else-if suggests that they are forcibly grouped together to just
>> share the call of kvm_get_msr_common(). For readability, I think it is better
>> to handle them separately.
>>
>> e.g.,
>> case MSR_IA32_S_CET:
>> if (kvm_get_msr_common(vcpu, msr_info))
>> return 1;
>> msr_info->data = vmcs_readl(GUEST_S_CET);
>> break;
>>
>> case MSR_KVM_GUEST_SSP:
>> if (kvm_get_msr_common(vcpu, msr_info))
>> return 1;
>> msr_info->data = vmcs_readl(GUEST_SSP);
>> break;
> Actually, we can do even better. We have an existing framework for these types
> of prechecks, I just completely forgot about it :-( (my "look at PAT" was a bad
> suggestion).
>
> Handle the checks in __kvm_set_msr() and __kvm_get_msr(), i.e. *before* calling
> into vendor code. Then vendor code doesn't need to make weird callbacks.
I see, will change it, thank you!
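To make sure I understand the suggested ordering, here is a minimal user-space sketch (not KVM code) of validating a U_CET/S_CET value in the common layer before the vendor callback runs. Every sk_* name is hypothetical, the bit masks follow the GENMASK values quoted in this thread, and the positions of SUPPRESS (bit 10) and WAIT_ENDBR (bit 11) are my assumption from the SDM layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical masks mirroring the GENMASK() values quoted in this thread. */
#define SK_CET_SHSTK_BITS  0x3ULL                 /* bits 1:0 */
#define SK_CET_IBT_BITS    (0x3cULL | ~0x3ffULL)  /* bits 5:2 and 63:10 */
#define SK_CET_RESERVED    0x3c0ULL               /* bits 9:6 */
#define SK_CET_SUPPRESS    (1ULL << 10)           /* assumed bit position */
#define SK_CET_WAIT_ENDBR  (1ULL << 11)           /* assumed bit position */

/* Hypothetical stand-in for the per-guest feature state. */
struct sk_guest {
	bool shstk;
	bool ibt;
};

/* Common-layer pre-check: returns 0 if acceptable, 1 if the write should fault. */
static int sk_cet_ctrl_precheck(const struct sk_guest *g, uint64_t data)
{
	if (!g->shstk && !g->ibt)
		return 1;
	if (data & SK_CET_RESERVED)
		return 1;
	if (!g->shstk && (data & SK_CET_SHSTK_BITS))
		return 1;
	if (!g->ibt && (data & SK_CET_IBT_BITS))
		return 1;
	/* Mirrors the quoted IS_ALIGNED(base, 4) check on (data >> 12). */
	if ((data >> 12) & 0x3)
		return 1;
	/* SUPPRESS and WAIT_ENDBR must not be set together. */
	if ((data & SK_CET_SUPPRESS) && (data & SK_CET_WAIT_ENDBR))
		return 1;
	return 0;
}

/* Stand-in for the vendor callback: it only stores the value. */
static uint64_t sk_vmcs_s_cet;

static int sk_vendor_set(uint64_t data)
{
	sk_vmcs_s_cet = data;
	return 0;
}

/* The pre-check runs first, so the vendor code never calls back into common code. */
static int sk_set_cet_ctrl(const struct sk_guest *g, uint64_t data)
{
	if (sk_cet_ctrl_precheck(g, data))
		return 1;
	return sk_vendor_set(data);
}
```

With this shape the vendor handler reduces to a plain store, which is the simplification I understand the suggestion to be aiming for.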
>>> int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>> {
>>> u32 msr = msr_info->index;
>>> @@ -3981,6 +4014,45 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>> vcpu->arch.guest_fpu.xfd_err = data;
>>> break;
>>> #endif
>>> +#define CET_EXCLUSIVE_BITS (CET_SUPPRESS | CET_WAIT_ENDBR)
>>> +#define CET_CTRL_RESERVED_BITS GENMASK(9, 6)
> Please use a single namespace for these #defines, e.g. CET_CTRL_* or maybe
> CET_US_* for everything.
OK.
>>> +#define CET_SHSTK_MASK_BITS GENMASK(1, 0)
>>> +#define CET_IBT_MASK_BITS (GENMASK_ULL(5, 2) | \
>>> + GENMASK_ULL(63, 10))
>>> +#define CET_LEG_BITMAP_BASE(data) ((data) >> 12)
> Bah, stupid SDM. Please spell out "LEGACY", I thought "LEG" was short for "LEGAL"
> since this looks a lot like a page shift, i.e. getting a pfn.
Sure :-)
>>> +static bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu,
>>> + struct msr_data *msr)
>>> +{
>>> + if (is_shadow_stack_msr(msr->index)) {
>>> + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
>>> + return false;
>>> +
>>> + if (msr->index == MSR_KVM_GUEST_SSP)
>>> + return msr->host_initiated;
>>> +
>>> + return msr->host_initiated ||
>>> + guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
>>> + }
>>> +
>>> + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
>>> + !kvm_cpu_cap_has(X86_FEATURE_IBT))
>>> + return false;
>>> +
>>> + return msr->host_initiated ||
>>> + guest_cpuid_has(vcpu, X86_FEATURE_IBT) ||
>>> + guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
> Similar to my suggestion for XSS, I think we drop the waiver for host_initiated
> accesses, i.e. require the feature to be enabled and exposed to the guest, even
> for the host.
I saw that Paolo has a different opinion on this, so I'll hold off for a while...
>>> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
>>> index c69fc027f5ec..3b79d6db2f83 100644
>>> --- a/arch/x86/kvm/x86.h
>>> +++ b/arch/x86/kvm/x86.h
>>> @@ -552,4 +552,22 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
>>> unsigned int port, void *data, unsigned int count,
>>> int in);
>>>
>>> +/*
>>> + * Guest xstate MSRs have been loaded in __msr_io(), disable preemption before
>>> + * access the MSRs to avoid MSR content corruption.
>>> + */
>> I think it is better to describe what the function does prior to jumping into
>> details like where guest FPU is loaded.
OK, will do it, thanks!
>> /*
>> * Lock and/or reload guest FPU and access xstate MSRs. For accesses initiated
>> * by host, guest FPU is loaded in __msr_io(). For accesses initiated by guest,
>> * guest FPU should have been loaded already.
>> */
>>> +static inline void kvm_get_xsave_msr(struct msr_data *msr_info)
>>> +{
>>> + kvm_fpu_get();
>>> + rdmsrl(msr_info->index, msr_info->data);
>>> + kvm_fpu_put();
>>> +}
>>> +
>>> +static inline void kvm_set_xsave_msr(struct msr_data *msr_info)
>>> +{
>>> + kvm_fpu_get();
>>> + wrmsrl(msr_info->index, msr_info->data);
>>> + kvm_fpu_put();
>>> +}
>> Can you rename functions to kvm_get/set_xstate_msr() to align with the comment
>> and patch 6? And if there is no user outside x86.c, you can just put these two
>> functions right after the is_xstate_msr() added in patch 6.
OK. I probably added the helpers in this patch due to the compilation error "function is defined but not used".
> +1. These should also assert that (a) guest FPU state is loaded and
Do you mean something like this:
WARN_ON_ONCE(!vcpu->arch.guest_fpu->in_use) or KVM_BUG_ON()
added in the helpers?
> (b) the MSR
> is passed through to the guest. I might be ok dropping (b) if both VMX and SVM
> passthrough all MSRs if they're exposed to the guest, i.e. not lazily passed
> through.
I'm OK with adding the assert if all the CET MSRs end up being passed through directly.
> Sans any changes to kvm_{g,s}et_xsave_msr(), I think this? (completely untested)
>
>
> ---
> arch/x86/kvm/vmx/vmx.c | 34 +++-------
> arch/x86/kvm/x86.c | 151 +++++++++++++++--------------------------
> 2 files changed, 64 insertions(+), 121 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 491039aeb61b..1211eb469d06 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2100,16 +2100,13 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
> break;
> case MSR_IA32_S_CET:
> + msr_info->data = vmcs_readl(GUEST_S_CET);
> + break;
> case MSR_KVM_GUEST_SSP:
> + msr_info->data = vmcs_readl(GUEST_SSP);
> + break;
> case MSR_IA32_INT_SSP_TAB:
> - if (kvm_get_msr_common(vcpu, msr_info))
> - return 1;
> - if (msr_info->index == MSR_KVM_GUEST_SSP)
> - msr_info->data = vmcs_readl(GUEST_SSP);
> - else if (msr_info->index == MSR_IA32_S_CET)
> - msr_info->data = vmcs_readl(GUEST_S_CET);
> - else if (msr_info->index == MSR_IA32_INT_SSP_TAB)
> - msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
> + msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
> break;
> case MSR_IA32_DEBUGCTLMSR:
> msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
> @@ -2432,25 +2429,14 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> else
> vmx->pt_desc.guest.addr_a[index / 2] = data;
> break;
> - case MSR_IA32_PL0_SSP ... MSR_IA32_PL2_SSP:
> - if (kvm_set_msr_common(vcpu, msr_info))
> - return 1;
> - if (data) {
> - vmx_disable_write_intercept_sss_msr(vcpu);
> - wrmsrl(msr_index, data);
> - }
> - break;
> case MSR_IA32_S_CET:
> + vmcs_writel(GUEST_S_CET, data);
> + break;
> case MSR_KVM_GUEST_SSP:
> + vmcs_writel(GUEST_SSP, data);
> + break;
> case MSR_IA32_INT_SSP_TAB:
> - if (kvm_set_msr_common(vcpu, msr_info))
> - return 1;
> - if (msr_index == MSR_KVM_GUEST_SSP)
> - vmcs_writel(GUEST_SSP, data);
> - else if (msr_index == MSR_IA32_S_CET)
> - vmcs_writel(GUEST_S_CET, data);
> - else if (msr_index == MSR_IA32_INT_SSP_TAB)
> - vmcs_writel(GUEST_INTR_SSP_TABLE, data);
> + vmcs_writel(GUEST_INTR_SSP_TABLE, data);
> break;
> case MSR_IA32_PERF_CAPABILITIES:
> if (data && !vcpu_to_pmu(vcpu)->version)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7385fc25a987..75e6de7c9268 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1838,6 +1838,11 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type)
> }
> EXPORT_SYMBOL_GPL(kvm_msr_allowed);
>
> +#define CET_US_RESERVED_BITS GENMASK(9, 6)
> +#define CET_US_SHSTK_MASK_BITS GENMASK(1, 0)
> +#define CET_US_IBT_MASK_BITS (GENMASK_ULL(5, 2) | GENMASK_ULL(63, 10))
> +#define CET_US_LEGACY_BITMAP_BASE(data) ((data) >> 12)
> +
> /*
> * Write @data into the MSR specified by @index. Select MSR specific fault
> * checks are bypassed if @host_initiated is %true.
> @@ -1897,6 +1902,35 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
>
> data = (u32)data;
> break;
> + case MSR_IA32_U_CET:
> + case MSR_IA32_S_CET:
> + if (!guest_can_use(vcpu, X86_FEATURE_SHSTK) &&
> + !guest_can_use(vcpu, X86_FEATURE_IBT))
> + return 1;
> + if (data & CET_US_RESERVED_BITS)
> + return 1;
> + if (!guest_can_use(vcpu, X86_FEATURE_SHSTK) &&
> + (data & CET_US_SHSTK_MASK_BITS))
> + return 1;
> + if (!guest_can_use(vcpu, X86_FEATURE_IBT) &&
> + (data & CET_US_IBT_MASK_BITS))
> + return 1;
> + if (!IS_ALIGNED(CET_US_LEGACY_BITMAP_BASE(data), 4))
> + return 1;
> +
> + /* IBT can be suppressed iff the TRACKER isn't WAIT_ENDBR. */
> + if ((data & CET_SUPPRESS) && (data & CET_WAIT_ENDBR))
> + return 1;
> + break;
> + case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB:
> + case MSR_KVM_GUEST_SSP:
> + if (!guest_can_use(vcpu, X86_FEATURE_SHSTK))
> + return 1;
> + if (is_noncanonical_address(data, vcpu))
> + return 1;
> + if (!IS_ALIGNED(data, 4))
> + return 1;
> + break;
> }
>
> msr.data = data;
> @@ -1940,6 +1974,17 @@ static int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data,
> !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
> return 1;
> break;
> + case MSR_IA32_U_CET:
> + case MSR_IA32_S_CET:
> + if (!guest_can_use(vcpu, X86_FEATURE_IBT) &&
> + !guest_can_use(vcpu, X86_FEATURE_SHSTK))
> + return 1;
> + break;
> + case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB:
> + case MSR_KVM_GUEST_SSP:
> + if (!guest_can_use(vcpu, X86_FEATURE_SHSTK))
> + return 1;
> + break;
> }
>
> msr.index = index;
> @@ -3640,47 +3685,6 @@ static bool kvm_is_msr_to_save(u32 msr_index)
> return false;
> }
>
> -static inline bool is_shadow_stack_msr(u32 msr)
> -{
> - return msr == MSR_IA32_PL0_SSP ||
> - msr == MSR_IA32_PL1_SSP ||
> - msr == MSR_IA32_PL2_SSP ||
> - msr == MSR_IA32_PL3_SSP ||
> - msr == MSR_IA32_INT_SSP_TAB ||
> - msr == MSR_KVM_GUEST_SSP;
> -}
> -
> -static bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu,
> - struct msr_data *msr)
> -{
> - if (is_shadow_stack_msr(msr->index)) {
> - if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
> - return false;
> -
> - /*
> - * This MSR is synthesized mainly for userspace access during
> - * Live Migration, it also can be accessed in SMM mode by VMM.
> - * Guest is not allowed to access this MSR.
> - */
> - if (msr->index == MSR_KVM_GUEST_SSP) {
> - if (IS_ENABLED(CONFIG_X86_64) && is_smm(vcpu))
> - return true;
> -
> - return msr->host_initiated;
> - }
> -
> - return msr->host_initiated ||
> - guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
> - }
> -
> - if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
> - !kvm_cpu_cap_has(X86_FEATURE_IBT))
> - return false;
> -
> - return msr->host_initiated ||
> - guest_cpuid_has(vcpu, X86_FEATURE_IBT) ||
> - guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
> -}
>
> int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> {
> @@ -4036,46 +4040,9 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> vcpu->arch.guest_fpu.xfd_err = data;
> break;
> #endif
> -#define CET_EXCLUSIVE_BITS (CET_SUPPRESS | CET_WAIT_ENDBR)
> -#define CET_CTRL_RESERVED_BITS GENMASK(9, 6)
> -#define CET_SHSTK_MASK_BITS GENMASK(1, 0)
> -#define CET_IBT_MASK_BITS (GENMASK_ULL(5, 2) | \
> - GENMASK_ULL(63, 10))
> -#define CET_LEG_BITMAP_BASE(data) ((data) >> 12)
> case MSR_IA32_U_CET:
> - case MSR_IA32_S_CET:
> - if (!kvm_cet_is_msr_accessible(vcpu, msr_info))
> - return 1;
> - if (!!(data & CET_CTRL_RESERVED_BITS))
> - return 1;
> - if (!guest_can_use(vcpu, X86_FEATURE_SHSTK) &&
> - (data & CET_SHSTK_MASK_BITS))
> - return 1;
> - if (!guest_can_use(vcpu, X86_FEATURE_IBT) &&
> - (data & CET_IBT_MASK_BITS))
> - return 1;
> - if (!IS_ALIGNED(CET_LEG_BITMAP_BASE(data), 4) ||
> - (data & CET_EXCLUSIVE_BITS) == CET_EXCLUSIVE_BITS)
> - return 1;
> - if (msr == MSR_IA32_U_CET)
> - kvm_set_xsave_msr(msr_info);
> - break;
> - case MSR_KVM_GUEST_SSP:
> - case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB:
> - if (!kvm_cet_is_msr_accessible(vcpu, msr_info))
> - return 1;
> - if (is_noncanonical_address(data, vcpu))
> - return 1;
> - if (!IS_ALIGNED(data, 4))
> - return 1;
> - if (msr == MSR_IA32_PL0_SSP || msr == MSR_IA32_PL1_SSP ||
> - msr == MSR_IA32_PL2_SSP) {
> - vcpu->arch.cet_s_ssp[msr - MSR_IA32_PL0_SSP] = data;
> - if (!vcpu->arch.cet_sss_active && data)
> - vcpu->arch.cet_sss_active = true;
> - } else if (msr == MSR_IA32_PL3_SSP) {
> - kvm_set_xsave_msr(msr_info);
> - }
> + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> + kvm_set_xsave_msr(msr_info);
> break;
> default:
> if (kvm_pmu_is_valid_msr(vcpu, msr))
> @@ -4436,17 +4403,8 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> break;
> #endif
> case MSR_IA32_U_CET:
> - case MSR_IA32_S_CET:
> - case MSR_KVM_GUEST_SSP:
> - case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB:
> - if (!kvm_cet_is_msr_accessible(vcpu, msr_info))
> - return 1;
> - if (msr == MSR_IA32_PL0_SSP || msr == MSR_IA32_PL1_SSP ||
> - msr == MSR_IA32_PL2_SSP) {
> - msr_info->data = vcpu->arch.cet_s_ssp[msr - MSR_IA32_PL0_SSP];
> - } else if (msr == MSR_IA32_U_CET || msr == MSR_IA32_PL3_SSP) {
> - kvm_get_xsave_msr(msr_info);
> - }
> + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> + kvm_get_xsave_msr(msr_info);
> break;
> default:
> if (kvm_pmu_is_valid_msr(vcpu, msr))
> @@ -7330,9 +7288,13 @@ static void kvm_probe_msr_to_save(u32 msr_index)
> break;
> case MSR_IA32_U_CET:
> case MSR_IA32_S_CET:
> + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
> + !kvm_cpu_cap_has(X86_FEATURE_IBT))
> + return;
> + break;
> case MSR_KVM_GUEST_SSP:
> case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB:
> - if (!kvm_is_cet_supported())
> + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
> return;
> break;
> default:
> @@ -9664,13 +9626,8 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
> kvm_caps.supported_xcr0 = host_xcr0 & KVM_SUPPORTED_XCR0;
> }
> if (boot_cpu_has(X86_FEATURE_XSAVES)) {
> - u32 eax, ebx, ecx, edx;
> -
> - cpuid_count(0xd, 1, &eax, &ebx, &ecx, &edx);
> rdmsrl(MSR_IA32_XSS, host_xss);
> kvm_caps.supported_xss = host_xss & KVM_SUPPORTED_XSS;
> - if (ecx & XFEATURE_MASK_CET_KERNEL)
> - kvm_caps.supported_xss |= XFEATURE_MASK_CET_KERNEL;
> }
>
> rdmsrl_safe(MSR_EFER, &host_efer);
>
> base-commit: efb9177acd7a4df5883b844e1ec9c69ef0899c9c
The code looks good to me except for the handling of MSR_KVM_GUEST_SSP:
non-host-initiated reads/writes should be prevented.
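A rough user-space sketch of the gating I have in mind (again not KVM code; the sk_* names and the enum are hypothetical, and the canonical check assumes 48-bit virtual addresses for simplicity):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical MSR identifiers; SK_MSR_GUEST_SSP stands for the synthetic MSR. */
enum sk_msr {
	SK_MSR_PL0_SSP,
	SK_MSR_PL3_SSP,
	SK_MSR_GUEST_SSP,
};

/* Canonical if bits 63:47 are all zeros or all ones (48-bit assumption). */
static bool sk_is_canonical48(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffff;
}

/* Returns 0 if the access may proceed, 1 if it should fault. */
static int sk_ssp_precheck(bool guest_has_shstk, enum sk_msr msr,
			   bool host_initiated, bool write, uint64_t data)
{
	if (!guest_has_shstk)
		return 1;

	/* The synthetic MSR is for userspace only; reject guest accesses. */
	if (msr == SK_MSR_GUEST_SSP && !host_initiated)
		return 1;

	if (write) {
		if (!sk_is_canonical48(data))
			return 1;
		if (data & 0x3)	/* SSP values must be 4-byte aligned */
			return 1;
	}

	return 0;
}
```

The only difference from the quoted pre-checks is the extra host_initiated test for the synthetic MSR, applied to reads as well as writes.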