linux-kernel - Re: [PATCH v6 06/20] KVM: VMX: Set FRED MSR intercepts

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c45a7c91-e393-4f71-8b22-aef6486aaa9e@zytor.com>
Date: Tue, 26 Aug 2025 14:59:01 -0700
From: Xin Li <xin@...or.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-doc@...r.kernel.org, pbonzini@...hat.com, corbet@....net,
        tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
        dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
        luto@...nel.org, peterz@...radead.org, andrew.cooper3@...rix.com,
        chao.gao@...el.com, hch@...radead.org
Subject: Re: [PATCH v6 06/20] KVM: VMX: Set FRED MSR intercepts


>> Hi Sean,
>>
>> I'd like to bring up an issue concerning MSR_IA32_PL0_SSP.
>>
>> The FRED spec claims:
>>
>> The FRED SSP MSRs are supported by any processor that enumerates
>> CPUID.(EAX=7,ECX=1):EAX.FRED[bit 17] as 1. If such a processor does not
>> support CET, FRED transitions will not use the MSRs (because shadow stacks
>> are not enabled), but the MSRs would still be accessible using MSR-access
>> instructions (e.g., RDMSR, WRMSR).
>>
>> It means KVM needs to handle MSR_IA32_PL0_SSP even when FRED is supported
>> but CET is not.  And this can be broken down into two subtasks:
>>
>> 1) Allow such a guest to access MSR_IA32_PL0_SSP w/o triggering #GP.  And
>> this behavior is already implemented in patch 8 of this series.
>>
>> 2) Save and restore MSR_IA32_PL0_SSP in both KVM and Qemu for such a guest.
> 
> What novel work needs to be done in KVM?  For QEMU, I assume it's just adding an
> "or FRED" somewhere.  For KVM, I'm missing what additional work would be required
> that wouldn't be naturally covered by patch 8 (assuming patch 8 is bug-free).

Extra patches:

1) A patch to save/restore guest MSR_IA32_PL0_SSP (i.e., FRED SSP0), as
what we have done for RSP0, following is the patch on top of the patch 
saving/restoring RSP0:

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 449a5e02c7de..0bf684342a71 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1294,8 +1294,13 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)

  	wrmsrq(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);

-	if (guest_cpu_cap_has(vcpu, X86_FEATURE_FRED))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_FRED)) {
  		wrmsrns(MSR_IA32_FRED_RSP0, vmx->msr_guest_fred_rsp0);
+
+		/* XSAVES/XRSTORS do not cover SSP MSRs */
+		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
+			wrmsrns(MSR_IA32_FRED_SSP0, vmx->msr_guest_fred_ssp0);
+	}
  #else
  	savesegment(fs, fs_sel);
  	savesegment(gs, gs_sel);
@@ -1349,6 +1354,10 @@ static void vmx_prepare_switch_to_host(struct 
vcpu_vmx *vmx)
  		 * CPU exits to userspace (RSP0 is a per-task value).
  		 */
  		fred_sync_rsp0(vmx->msr_guest_fred_rsp0);
+
+		/* XSAVES/XRSTORS do not cover SSP MSRs */
+		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
+			vmx->msr_guest_fred_ssp0 = read_msr(MSR_IA32_FRED_SSP0);
  	}
  #endif
  	load_fixmap_gdt(raw_smp_processor_id());
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 733fa2ef4bea..12c1cf827cb7 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -228,6 +228,7 @@ struct vcpu_vmx {
  #ifdef CONFIG_X86_64
  	u64		      msr_guest_kernel_gs_base;
  	u64		      msr_guest_fred_rsp0;
+	u64		      msr_guest_fred_ssp0;
  #endif

  	u64		      spec_ctrl;

And We might want to zero host MSR_IA32_PL0_SSP when switching to host.


2) Add vmx_read_guest_fred_ssp0()/vmx_write_guest_fred_ssp0(), and use
them to read/write MSR_IA32_PL0_SSP in patch 8:

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 99106750b1e3..cbdc67682d27 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1400,9 +1408,23 @@ static void vmx_write_guest_fred_rsp0(struct 
vcpu_vmx *vmx, u64 data)
  	vmx_write_guest_host_msr(vmx, MSR_IA32_FRED_RSP0, data,
  				 &vmx->msr_guest_fred_rsp0);
  }
+
+static u64 vmx_read_guest_fred_ssp0(struct vcpu_vmx *vmx)
+{
+	return vmx_read_guest_host_msr(vmx, MSR_IA32_FRED_SSP0,
+				       &vmx->msr_guest_fred_ssp0);
+}
+
+static void vmx_write_guest_fred_ssp0(struct vcpu_vmx *vmx, u64 data)
+{
+	vmx_write_guest_host_msr(vmx, MSR_IA32_FRED_SSP0, data,
+				 &vmx->msr_guest_fred_ssp0);
+}
  #endif

  static void grow_ple_window(struct kvm_vcpu *vcpu)
@@ -2189,6 +2211,18 @@ int vmx_get_msr(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
  	case MSR_IA32_DEBUGCTLMSR:
  		msr_info->data = vmx_guest_debugctl_read();
  		break;
+	case MSR_IA32_PL0_SSP:
+		/*
+		 * If kvm_cpu_cap_has(X86_FEATURE_SHSTK) but
+		 * !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK), XSAVES/XRSTORS
+		 * cover SSP MSRs.
+		 */
+		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+		    guest_cpu_cap_has(vcpu, X86_FEATURE_FRED)) {
+			msr_info->data = vmx_read_guest_fred_ssp0(vmx);
+			break;
+		}
+		fallthrough;
  	default:
  	find_uret_msr:
  		msr = vmx_find_uret_msr(vmx, msr_info->index);
@@ -2540,7 +2574,18 @@ int vmx_set_msr(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
  		}
  		ret = kvm_set_msr_common(vcpu, msr_info);
  		break;
-
+	case MSR_IA32_PL0_SSP:
+		/*
+		 * If kvm_cpu_cap_has(X86_FEATURE_SHSTK) but
+		 * !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK), XSAVES/XRSTORS
+		 * cover SSP MSRs.
+		 */
+		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+		    guest_cpu_cap_has(vcpu, X86_FEATURE_FRED)) {
+			vmx_write_guest_fred_ssp0(vmx, data);
+			break;
+		}
+		fallthrough;
  	default:
  	find_uret_msr:
  		msr = vmx_find_uret_msr(vmx, msr_index);


3) Another change I was discussing with Chao:
https://lore.kernel.org/lkml/2ed04dff-e778-46c6-bd5f-51295763af06@zytor.com/

> 
>> I have the patches for 2) but they are not included in this series, because
>>
>> 1) how much do we care the value in MSR_IA32_PL0_SSP in such a guest?
>>
>> Yes, Chao told me that you are the one saying that MSRs can be used as
>> clobber registers and KVM should preserve the value.  Does MSR_IA32_PL0_SSP
>> in such a guest count?
> 
> If the architecture says that MSR_IA32_PL0_SSP exists and is accessible, then
> KVM needs to honor that.
> 
>> 2) Saving/restoring MSR_IA32_PL0_SSP adds complexity, though it's seldom
>> used.  Is it worth it?
> 
> Honoring the architecture is generally not optional.  There are extreme cases
> where KVM violates that rule and takes (often undocumented) erratum, e.g. APIC
> base relocation would require an absurd amount of complexity for no real world
> benefit.  But I would be very surprised if the complexity in KVM or QEMU to support
> this scenario is at all meaningful, let alone enough to justify diverging from
> the architectural spec.

Let me post v7 which includes all the required changes.

Thanks!
     Xin