linux-kernel - Re: [PATCH v9 08/22] KVM: VMX: Set FRED MSR intercepts

Message-ID: <aRQf1sQZ9Z3CTB8i@intel.com>
Date: Wed, 12 Nov 2025 13:49:10 +0800
From: Chao Gao <chao.gao@...el.com>
To: "Xin Li (Intel)" <xin@...or.com>
CC: <linux-kernel@...r.kernel.org>, <kvm@...r.kernel.org>,
	<linux-doc@...r.kernel.org>, <pbonzini@...hat.com>, <seanjc@...gle.com>,
	<corbet@....net>, <tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
	<dave.hansen@...ux.intel.com>, <x86@...nel.org>, <hpa@...or.com>,
	<luto@...nel.org>, <peterz@...radead.org>, <andrew.cooper3@...rix.com>,
	<hch@...radead.org>, <sohil.mehta@...el.com>
Subject: Re: [PATCH v9 08/22] KVM: VMX: Set FRED MSR intercepts

On Sun, Oct 26, 2025 at 01:18:56PM -0700, Xin Li (Intel) wrote:
>From: Xin Li <xin3.li@...el.com>
>
>On a userspace MSR filter change, set FRED MSR intercepts.
>
>The eight FRED MSRs, MSR_IA32_FRED_RSP[123], MSR_IA32_FRED_STKLVLS,
>MSR_IA32_FRED_SSP[123] and MSR_IA32_FRED_CONFIG, are all safe to
>pass through, because each has a corresponding host and guest field
>in VMCS.

Sean prefers to pass through MSRs only when there is a reason to do so, rather
than just because it is free [*]. My thinking is that the RSPs and SSPs are
per-task and context-switched frequently, so we need to pass them through. But
I am not sure there is a reason to pass through STKLVLS and CONFIG.

[*] https://lore.kernel.org/all/aKTGVvOb8PZ7mzVr@google.com/
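
To make the suggested split concrete, here is a minimal standalone sketch of
the decision, not kernel code. It assumes that only the RSP and SSP MSRs are
written often enough to justify passthrough and that STKLVLS and CONFIG stay
intercepted; that policy is still an open question. The enum and both helper
names are hypothetical and do not exist in KVM.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the eight FRED MSRs named in the changelog. */
enum fred_msr {
	FRED_RSP1, FRED_RSP2, FRED_RSP3,
	FRED_STKLVLS,
	FRED_SSP1, FRED_SSP2, FRED_SSP3,
	FRED_CONFIG,
};

/*
 * Assumption: RSPs and SSPs are context-switched frequently by the guest,
 * while STKLVLS and CONFIG are written rarely (e.g. once at setup).
 */
static bool fred_msr_is_hot(enum fred_msr msr)
{
	switch (msr) {
	case FRED_STKLVLS:
	case FRED_CONFIG:
		return false;
	default:
		return true;
	}
}

/* Returns true if guest accesses to the MSR should be intercepted. */
static bool fred_msr_should_intercept(enum fred_msr msr, bool guest_has_fred)
{
	/* Without guest FRED, every FRED MSR stays intercepted. */
	if (!guest_has_fred)
		return true;

	/* With guest FRED, pass through only the frequently written MSRs. */
	return !fred_msr_is_hot(msr);
}
```

Under this assumed policy, a guest with FRED would get direct access to
RSP[123] and SSP[123], while STKLVLS and CONFIG accesses would still exit.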

>
>Both MSR_IA32_FRED_RSP0 and MSR_IA32_FRED_SSP0 (aka MSR_IA32_PL0_SSP)
>are dedicated for userspace event delivery, IOW they are NOT used in
>any kernel event delivery and the execution of ERETS.  Thus KVM can
>run safely with guest values in the two MSRs.  As a result, save and
>restore of their guest values are deferred until vCPU context switch.
>Host MSR_IA32_FRED_RSP0 is restored upon returning to userspace, and
>Host MSR_IA32_PL0_SSP is managed with XRSTORS/XSAVES.
>
>Note, FRED SSP MSRs, including MSR_IA32_PL0_SSP, are available on
>any processor that enumerates FRED.  On processors that support FRED
>but not CET, FRED transitions do not use these MSRs, but they remain
>accessible via MSR instructions such as RDMSR and WRMSR.
>
>Intercept MSR_IA32_PL0_SSP when CET shadow stack is not supported,
>regardless of FRED support.  This ensures the guest value remains
>fully virtual and does not modify the hardware FRED SSP0 MSR.
>
>This behavior is consistent with the current setup in
>vmx_recalc_msr_intercepts(), so no change is needed to the interception
>logic for MSR_IA32_PL0_SSP.
>
>Signed-off-by: Xin Li <xin3.li@...el.com>
>Signed-off-by: Xin Li (Intel) <xin@...or.com>
>Tested-by: Shan Kang <shan.kang@...el.com>
>Tested-by: Xuelian Guo <xuelian.guo@...el.com>
>---
>
>Changes in v7:
>* Rewrite the changelog and comment, majorly for MSR_IA32_PL0_SSP.
>
>Changes in v5:
>* Skip execution of vmx_set_intercept_for_fred_msr() if FRED is
>  not available or enabled (Sean).
>* Use 'intercept' as the variable name to indicate whether MSR
>  interception should be enabled (Sean).
>* Add TB from Xuelian Guo.
>---
> arch/x86/kvm/vmx/vmx.c | 47 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 47 insertions(+)
>
>diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>index c8b5359123bf..ef9765779884 100644
>--- a/arch/x86/kvm/vmx/vmx.c
>+++ b/arch/x86/kvm/vmx/vmx.c
>@@ -4146,6 +4146,51 @@ void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu)
> 	}
> }
> 
>+static void vmx_set_intercept_for_fred_msr(struct kvm_vcpu *vcpu)
>+{
>+	bool intercept = !guest_cpu_cap_has(vcpu, X86_FEATURE_FRED);
>+
>+	if (!kvm_cpu_cap_has(X86_FEATURE_FRED))
>+		return;
>+
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP1, MSR_TYPE_RW, intercept);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP2, MSR_TYPE_RW, intercept);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP3, MSR_TYPE_RW, intercept);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_STKLVLS, MSR_TYPE_RW, intercept);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_SSP1, MSR_TYPE_RW, intercept);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_SSP2, MSR_TYPE_RW, intercept);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_SSP3, MSR_TYPE_RW, intercept);
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_CONFIG, MSR_TYPE_RW, intercept);
>+
>+	/*
>+	 * MSR_IA32_FRED_RSP0 and MSR_IA32_PL0_SSP (aka MSR_IA32_FRED_SSP0) are
>+	 * designed for event delivery while executing in userspace.  Since KVM
>+	 * operates entirely in kernel mode (CPL is always 0 after any VM exit),
>+	 * it can safely retain and operate with guest-defined values for these
>+	 * MSRs.
>+	 *
>+	 * As a result, interception of MSR_IA32_FRED_RSP0 and MSR_IA32_PL0_SSP
>+	 * is unnecessary.

I think it would be slightly better to document why these MSRs need to be
passed through, rather than just why it is safe to pass them through.

>+	 *
>+	 * Note: Saving and restoring MSR_IA32_PL0_SSP is part of CET supervisor
>+	 * context management.  However, FRED SSP MSRs, including MSR_IA32_PL0_SSP,
>+	 * are available on any processor that enumerates FRED.
>+	 *
>+	 * On processors that support FRED but not CET, FRED transitions do not
>+	 * use these MSRs, but they remain accessible via MSR instructions such
>+	 * as RDMSR and WRMSR.
>+	 *
>+	 * Intercept MSR_IA32_PL0_SSP when CET shadow stack is not supported,
>+	 * regardless of FRED support.  This ensures the guest value remains
>+	 * fully virtual and does not modify the hardware FRED SSP0 MSR.

Modifying the hardware MSR itself isn't a problem. The problem is that the
guest isn't expected to access the MSR frequently if CET isn't supported, and
the MSR will never be accessed via XSAVES. So there is no good reason to pass
it through. And passing it through would mean KVM needs to context-switch it
on vCPU load/put, i.e., more code and complexity.
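
As a standalone illustration of that rule (a sketch only, with a hypothetical
helper name that is not part of KVM), the interception decision for
MSR_IA32_PL0_SSP depends on shadow stack support alone:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if MSR_IA32_PL0_SSP should be
 * intercepted.  The guest_has_fred argument is accepted only to show
 * that it does not influence the result.
 */
static bool pl0_ssp_should_intercept(bool guest_has_shstk, bool guest_has_fred)
{
	(void)guest_has_fred;	/* deliberately ignored */

	/*
	 * Without CET shadow stack the MSR is neither accessed often nor
	 * handled by XSAVES/XRSTORS, so there is no reason to pass it
	 * through and take on the extra context-switch work.
	 */
	return !guest_has_shstk;
}
```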

>+	 *
>+	 * This behavior is consistent with the current setup in
>+	 * vmx_recalc_msr_intercepts(), so no change is needed to the interception
>+	 * logic for MSR_IA32_PL0_SSP.
>+	 */
>+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP0, MSR_TYPE_RW, intercept);
>+}
>+