linux-kernel - Re: [PATCH v15 09/41] KVM: x86: Load guest FPU state when access XSAVE-managed MSRs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <65184bb3-0a36-49c3-b212-4b19f2df7092@zytor.com>
Date: Mon, 15 Sep 2025 10:04:51 -0700
From: Xin Li <xin@...or.com>
To: Sean Christopherson <seanjc@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        Tom Lendacky <thomas.lendacky@....com>,
        Mathias Krause <minipli@...ecurity.net>,
        John Allen <john.allen@....com>,
        Rick Edgecombe <rick.p.edgecombe@...el.com>,
        Chao Gao <chao.gao@...el.com>, Maxim Levitsky <mlevitsk@...hat.com>,
        Xiaoyao Li <xiaoyao.li@...el.com>,
        Zhang Yi Z <yi.z.zhang@...ux.intel.com>
Subject: Re: [PATCH v15 09/41] KVM: x86: Load guest FPU state when access
 XSAVE-managed MSRs

On 9/12/2025 4:22 PM, Sean Christopherson wrote:
> Load the guest's FPU state if userspace is accessing MSRs whose values
> are managed by XSAVES. Introduce two helpers, kvm_{get,set}_xstate_msr(),
> to facilitate access to such kind of MSRs.
> 
> If MSRs supported in kvm_caps.supported_xss are passed through to guest,
> the guest MSRs are swapped with host's before vCPU exits to userspace and
> after it reenters kernel before next VM-entry.
> 
> Because the modified code is also used for the KVM_GET_MSRS device ioctl(),
> explicitly check @vcpu is non-null before attempting to load guest state.
> The XSAVE-managed MSRs cannot be retrieved via the device ioctl() without
> loading guest FPU state (which doesn't exist).
> 
> Note that guest_cpuid_has() is not queried as host userspace is allowed to
> access MSRs that have not been exposed to the guest, e.g. it might do
> KVM_SET_MSRS prior to KVM_SET_CPUID2.
> 
> The two helpers are put here in order to manifest accessing xsave-managed
> MSRs requires special check and handling to guarantee the correctness of
> read/write to the MSRs.
> 
> Co-developed-by: Yang Weijiang <weijiang.yang@...el.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@...el.com>
> Reviewed-by: Maxim Levitsky <mlevitsk@...hat.com>
> Tested-by: Mathias Krause <minipli@...ecurity.net>
> Tested-by: John Allen <john.allen@....com>
> Tested-by: Rick Edgecombe <rick.p.edgecombe@...el.com>
> Signed-off-by: Chao Gao <chao.gao@...el.com>
> [sean: drop S_CET, add big comment, move accessors to x86.c]
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>


Reviewed-by: Xin Li (Intel) <xin@...or.com>


> ---
>   arch/x86/kvm/x86.c | 86 +++++++++++++++++++++++++++++++++++++++++++++-
>   1 file changed, 85 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c5e38d6943fe..a95ca2fbd3a9 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3801,6 +3804,66 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
>   	mark_page_dirty_in_slot(vcpu->kvm, ghc->memslot, gpa_to_gfn(ghc->gpa));
>   }
>   
> +/*
> + * Returns true if the MSR in question is managed via XSTATE, i.e. is context
> + * switched with the rest of guest FPU state.  Note!  S_CET is _not_ context
> + * switched via XSTATE even though it _is_ saved/restored via XSAVES/XRSTORS.
> + * Because S_CET is loaded on VM-Enter and VM-Exit via dedicated VMCS fields,
> + * the value saved/restored via XSTATE is always the host's value.  That detail
> + * is _extremely_ important, as the guest's S_CET must _never_ be resident in
> + * hardware while executing in the host.  Loading guest values for U_CET and
> + * PL[0-3]_SSP while executing in the kernel is safe, as U_CET is specific to
> + * userspace, and PL[0-3]_SSP are only consumed when transitioning to lower
> + * privilegel levels, i.e. are effectively only consumed by userspace as well.
> + */
> +static bool is_xstate_managed_msr(struct kvm_vcpu *vcpu, u32 msr)
> +{
> +	if (!vcpu)
> +		return false;
> +
> +	switch (msr) {
> +	case MSR_IA32_U_CET:
> +		return guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK) ||
> +		       guest_cpu_cap_has(vcpu, X86_FEATURE_IBT);
> +	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> +		return guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK);
> +	default:
> +		return false;
> +	}
> +}

With this new version of is_xstate_managed_msr(), which checks against vcpu
capabilities instead of KVM, patch 9 of KVM FRED patches[1] no longer needs
to make any change to it.  And this is the only conflict when I apply KVM
FRED patches on top of this v15 mega-CET patch series.

[1] https://lore.kernel.org/lkml/20250829153149.2871901-10-xin@zytor.com/