Message-ID: <cbeaa101-b0fd-49e1-9319-f6070b799214@intel.com>
Date: Tue, 23 Sep 2025 14:37:03 +0800
From: Xiaoyao Li <xiaoyao.li@...el.com>
To: Sean Christopherson <seanjc@...gle.com>,
 Paolo Bonzini <pbonzini@...hat.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH (CET v16 26.5)] KVM: x86: Initialize
 allow_smaller_maxphyaddr earlier in setup

On 9/23/2025 2:47 AM, Sean Christopherson wrote:
> Initialize allow_smaller_maxphyaddr during hardware setup as soon as KVM
> knows whether or not TDP will be utilized.  To avoid having to teach KVM's
> emulator all about CET, KVM's upcoming CET virtualization support will be
> mutually exclusive with allow_smaller_maxphyaddr, i.e. will disable SHSTK
> and IBT if allow_smaller_maxphyaddr is enabled.
> 
> In general, allow_smaller_maxphyaddr should be initialized as soon as
> possible since it's globally visible while its only input is whether or
> not EPT/NPT is enabled.  I.e. there's effectively zero risk of setting
> allow_smaller_maxphyaddr too early, and substantial risk of setting it
> too late.
> 
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@...el.com>

> ---
> 
> As the subject suggests, I'm going to slot this in when applying the CET
> series, as it is a dependency for disabling SHSTK and IBT when
> allow_smaller_maxphyaddr is enabled.  Without this, SVM will incorrectly
> clear (or fail to clear) SHSTK.  VMX isn't affected because !enable_ept
> disables unrestricted guest, which also clears SHSTK and IBT, but as the
> changelog calls out, there's no reason to wait to initialize
> allow_smaller_maxphyaddr.
> 
> https://lore.kernel.org/all/20250919223258.1604852-28-seanjc@google.com
> 
>   arch/x86/kvm/svm/svm.c | 30 +++++++++++++++---------------
>   arch/x86/kvm/vmx/vmx.c | 16 ++++++++--------
>   2 files changed, 23 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 54ca0ec5ea57..74a6e3868517 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -5413,6 +5413,21 @@ static __init int svm_hardware_setup(void)
>   			  get_npt_level(), PG_LEVEL_1G);
>   	pr_info("Nested Paging %s\n", str_enabled_disabled(npt_enabled));
>   
> +	/*
> +	 * It seems that on AMD processors PTE's accessed bit is
> +	 * being set by the CPU hardware before the NPF vmexit.
> +	 * This is not expected behaviour and our tests fail because
> +	 * of it.
> +	 * A workaround here is to disable support for
> +	 * GUEST_MAXPHYADDR < HOST_MAXPHYADDR if NPT is enabled.
> +	 * In this case userspace can know if there is support using
> +	 * KVM_CAP_SMALLER_MAXPHYADDR extension and decide how to handle
> +	 * it
> +	 * If future AMD CPU models change the behaviour described above,
> +	 * this variable can be changed accordingly
> +	 */
> +	allow_smaller_maxphyaddr = !npt_enabled;
> +
>   	/* Setup shadow_me_value and shadow_me_mask */
>   	kvm_mmu_set_me_spte_mask(sme_me_mask, sme_me_mask);
>   
> @@ -5492,21 +5507,6 @@ static __init int svm_hardware_setup(void)
>   
>   	svm_set_cpu_caps();
>   
> -	/*
> -	 * It seems that on AMD processors PTE's accessed bit is
> -	 * being set by the CPU hardware before the NPF vmexit.
> -	 * This is not expected behaviour and our tests fail because
> -	 * of it.
> -	 * A workaround here is to disable support for
> -	 * GUEST_MAXPHYADDR < HOST_MAXPHYADDR if NPT is enabled.
> -	 * In this case userspace can know if there is support using
> -	 * KVM_CAP_SMALLER_MAXPHYADDR extension and decide how to handle
> -	 * it
> -	 * If future AMD CPU models change the behaviour described above,
> -	 * this variable can be changed accordingly
> -	 */
> -	allow_smaller_maxphyaddr = !npt_enabled;
> -
>   	kvm_caps.inapplicable_quirks &= ~KVM_X86_QUIRK_CD_NW_CLEARED;
>   	return 0;
>   
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 509487a1f04a..ace8208fc1be 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8479,6 +8479,14 @@ __init int vmx_hardware_setup(void)
>   		return -EOPNOTSUPP;
>   	}
>   
> +	/*
> +	 * Shadow paging doesn't have a (further) performance penalty
> +	 * from GUEST_MAXPHYADDR < HOST_MAXPHYADDR so enable it
> +	 * by default
> +	 */
> +	if (!enable_ept)
> +		allow_smaller_maxphyaddr = true;
> +
>   	if (!cpu_has_vmx_ept_ad_bits() || !enable_ept)
>   		enable_ept_ad_bits = 0;
>   
> @@ -8715,14 +8723,6 @@ int __init vmx_init(void)
>   
>   	vmx_check_vmcs12_offsets();
>   
> -	/*
> -	 * Shadow paging doesn't have a (further) performance penalty
> -	 * from GUEST_MAXPHYADDR < HOST_MAXPHYADDR so enable it
> -	 * by default
> -	 */
> -	if (!enable_ept)
> -		allow_smaller_maxphyaddr = true;
> -
>   	return 0;
>   
>   err_l1d_flush:
> 
> base-commit: d44fa096b63659f2398a28f24d99e48c23857c82
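
For readers following the ordering argument in the changelog, here is a
minimal userspace sketch of why the flag has to be set before CPU
capabilities are computed.  Everything in it is hypothetical: struct
model_state and the model_* helpers are illustrative stand-ins, not KVM
code, and the "clear SHSTK if allow_smaller_maxphyaddr" rule is a
simplified assumption taken from the changelog's description of the
upcoming CET support.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for KVM's globals; not the kernel's definitions. */
struct model_state {
	bool tdp_enabled;		/* models npt_enabled / enable_ept */
	bool allow_smaller_maxphyaddr;
	bool shstk_advertised;		/* models the SHSTK CPU capability */
};

/* Models the SVM rule in the patch: the flag is allowed only without NPT. */
static void model_init_allow_smaller_maxphyaddr(struct model_state *s)
{
	s->allow_smaller_maxphyaddr = !s->tdp_enabled;
}

/* Assumed CET rule: SHSTK is not advertised when the flag is set. */
static void model_set_cpu_caps(struct model_state *s)
{
	s->shstk_advertised = !s->allow_smaller_maxphyaddr;
}

/* Ordering after the patch: flag first, then capabilities. */
static bool model_setup_early(bool tdp_enabled)
{
	struct model_state s = { .tdp_enabled = tdp_enabled };

	model_init_allow_smaller_maxphyaddr(&s);
	model_set_cpu_caps(&s);
	return s.shstk_advertised;
}

/* Ordering before the patch: capabilities read a not-yet-initialized flag. */
static bool model_setup_late(bool tdp_enabled)
{
	struct model_state s = { .tdp_enabled = tdp_enabled };

	model_set_cpu_caps(&s);
	model_init_allow_smaller_maxphyaddr(&s);
	return s.shstk_advertised;
}
```

In this model, calling model_setup_late(false) leaves SHSTK advertised even
though the flag ends up set, while model_setup_early(false) clears it.  With
TDP enabled the two orderings agree, which is why the problem only shows up
when NPT is disabled.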

