lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aCWbsM1qaMsKNkMj@gmail.com>
Date: Thu, 15 May 2025 09:45:52 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Ard Biesheuvel <ardb+git@...gle.com>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org,
	Ard Biesheuvel <ardb@...nel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Brian Gerst <brgerst@...il.com>
Subject: Re: [PATCH v3 1/7] x86/cpu: Use a new feature flag for 5 level paging


* Ard Biesheuvel <ardb+git@...gle.com> wrote:

> From: Ard Biesheuvel <ardb@...nel.org>
> 
> Currently, the LA57 CPU feature flag is taken to mean two different
> things at once:
> - whether the CPU implements the LA57 extension, and is therefore
>   capable of supporting 5 level paging;
> - whether 5 level paging is currently in use.
> 
> This means the LA57 capability of the hardware is hidden when a LA57
> capable CPU is forced to run with 4 levels of paging. It also means the
> the ordinary CPU capability detection code will happily set the LA57
> capability and it needs to be cleared explicitly afterwards to avoid
> inconsistencies.
> 
> Separate the two so that the CPU hardware capability can be identified
> unambigously in all cases.
> 
> To avoid breaking existing users that might assume that 5 level paging
> is being used when the "la57" string is visible in /proc/cpuinfo,
> repurpose that string to mean that 5-level paging is in use, and add a
> new string la57_capable to indicate that the CPU feature is implemented
> by the hardware.
> 
> Signed-off-by: Ard Biesheuvel <ardb@...nel.org>
> ---
>  arch/x86/include/asm/cpufeatures.h               |  3 ++-
>  arch/x86/include/asm/page_64.h                   |  2 +-
>  arch/x86/include/asm/pgtable_64_types.h          |  2 +-
>  arch/x86/kernel/cpu/common.c                     | 16 ++--------------
>  arch/x86/kvm/x86.h                               |  4 ++--
>  drivers/iommu/amd/init.c                         |  4 ++--
>  drivers/iommu/intel/svm.c                        |  4 ++--
>  tools/testing/selftests/kvm/x86/set_sregs_test.c |  2 +-
>  8 files changed, 13 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index f67a93fc9391..d59bee5907e7 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -395,7 +395,7 @@
>  #define X86_FEATURE_AVX512_BITALG	(16*32+12) /* "avx512_bitalg" Support for VPOPCNT[B,W] and VPSHUF-BITQMB instructions */
>  #define X86_FEATURE_TME			(16*32+13) /* "tme" Intel Total Memory Encryption */
>  #define X86_FEATURE_AVX512_VPOPCNTDQ	(16*32+14) /* "avx512_vpopcntdq" POPCNT for vectors of DW/QW */
> -#define X86_FEATURE_LA57		(16*32+16) /* "la57" 5-level page tables */
> +#define X86_FEATURE_LA57		(16*32+16) /* "la57_hw" 5-level page tables */
>  #define X86_FEATURE_RDPID		(16*32+22) /* "rdpid" RDPID instruction */
>  #define X86_FEATURE_BUS_LOCK_DETECT	(16*32+24) /* "bus_lock_detect" Bus Lock detect */
>  #define X86_FEATURE_CLDEMOTE		(16*32+25) /* "cldemote" CLDEMOTE instruction */
> @@ -483,6 +483,7 @@
>  #define X86_FEATURE_PREFER_YMM		(21*32+ 8) /* Avoid ZMM registers due to downclocking */
>  #define X86_FEATURE_APX			(21*32+ 9) /* Advanced Performance Extensions */
>  #define X86_FEATURE_INDIRECT_THUNK_ITS	(21*32+10) /* Use thunk for indirect branches in lower half of cacheline */
> +#define X86_FEATURE_5LEVEL_PAGING	(21*32+11) /* "la57" Whether 5 levels of page tables are in use */

So there's a new complication here, KVM doesn't like the use of 
synthethic CPU flags, for understandable reasons:

  inlined from ‘intel_pmu_set_msr’ at arch/x86/kvm/vmx/pmu_intel.c:369:7:
  ...
  ./arch/x86/kvm/reverse_cpuid.h:102:9: note: in expansion of macro ‘BUILD_BUG_ON’
    102 |         BUILD_BUG_ON(x86_leaf == CPUID_LNX_5);
        |         ^~~~~~~~~~~~

(See x86-64 allmodconfig)

Even though previously X86_FEATURE_LA57 was effectively a synthethic 
CPU flag (it got artificially turned off by the Linux kernel if 5-level 
paging was disabled) ...

So I think the most straightforward solution would be to do the change 
below, and pass through LA57 flag if 5-level paging is enabled in the 
host kernel. This is similar to as if the firmware turned off LA57, and 
it doesn't bring in all the early-boot complications bare metal has. It 
should also match the previous behavior I think.

Thoughts?

Thanks,

	Ingo

=================>

 arch/x86/kvm/cpuid.c | 6 ++++++
 arch/x86/kvm/x86.h   | 4 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 571c906ffcbf..d951d71aea3b 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -1221,6 +1221,12 @@ void kvm_set_cpu_caps(void)
 		kvm_cpu_cap_clear(X86_FEATURE_RDTSCP);
 		kvm_cpu_cap_clear(X86_FEATURE_RDPID);
 	}
+	/*
+	 * Clear the LA57 flag in the guest if the host kernel
+	 * does not have 5-level paging support:
+	 */
+	if (kvm_cpu_cap_has(X86_FEATURE_LA57) && !pgtable_l5_enabled())
+		kvm_cpu_cap_clear(X86_FEATURE_LA57);
 }
 EXPORT_SYMBOL_GPL(kvm_set_cpu_caps);
 
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index d2c093f17ae5..9dc32a409076 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -243,7 +243,7 @@ static inline u8 vcpu_virt_addr_bits(struct kvm_vcpu *vcpu)
 
 static inline u8 max_host_virt_addr_bits(void)
 {
-	return kvm_cpu_cap_has(X86_FEATURE_5LEVEL_PAGING) ? 57 : 48;
+	return kvm_cpu_cap_has(X86_FEATURE_LA57) ? 57 : 48;
 }
 
 /*
@@ -603,7 +603,7 @@ static inline bool __kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 		__reserved_bits |= X86_CR4_FSGSBASE;    \
 	if (!__cpu_has(__c, X86_FEATURE_PKU))           \
 		__reserved_bits |= X86_CR4_PKE;         \
-	if (!__cpu_has(__c, X86_FEATURE_5LEVEL_PAGING))          \
+	if (!__cpu_has(__c, X86_FEATURE_LA57))          \
 		__reserved_bits |= X86_CR4_LA57;        \
 	if (!__cpu_has(__c, X86_FEATURE_UMIP))          \
 		__reserved_bits |= X86_CR4_UMIP;        \

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ