lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMj1kXE7TwZU8_3N8xUGUV+XMxC6LcTMCAC0UKpr6i_Hwx=VfQ@mail.gmail.com>
Date: Thu, 15 May 2025 11:12:26 +0100
From: Ard Biesheuvel <ardb@...nel.org>
To: Ingo Molnar <mingo@...nel.org>
Cc: Ard Biesheuvel <ardb+git@...gle.com>, linux-kernel@...r.kernel.org, x86@...nel.org, 
	Linus Torvalds <torvalds@...ux-foundation.org>, Brian Gerst <brgerst@...il.com>
Subject: Re: [PATCH v3 1/7] x86/cpu: Use a new feature flag for 5 level paging

On Thu, 15 May 2025 at 08:45, Ingo Molnar <mingo@...nel.org> wrote:
>
>
> * Ard Biesheuvel <ardb+git@...gle.com> wrote:
>
> > From: Ard Biesheuvel <ardb@...nel.org>
> >
> > Currently, the LA57 CPU feature flag is taken to mean two different
> > things at once:
> > - whether the CPU implements the LA57 extension, and is therefore
> >   capable of supporting 5 level paging;
> > - whether 5 level paging is currently in use.
> >
> > This means the LA57 capability of the hardware is hidden when a LA57
> > capable CPU is forced to run with 4 levels of paging. It also means the
> > the ordinary CPU capability detection code will happily set the LA57
> > capability and it needs to be cleared explicitly afterwards to avoid
> > inconsistencies.
> >
> > Separate the two so that the CPU hardware capability can be identified
> > unambigously in all cases.
> >
> > To avoid breaking existing users that might assume that 5 level paging
> > is being used when the "la57" string is visible in /proc/cpuinfo,
> > repurpose that string to mean that 5-level paging is in use, and add a
> > new string la57_capable to indicate that the CPU feature is implemented
> > by the hardware.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@...nel.org>
> > ---
> >  arch/x86/include/asm/cpufeatures.h               |  3 ++-
> >  arch/x86/include/asm/page_64.h                   |  2 +-
> >  arch/x86/include/asm/pgtable_64_types.h          |  2 +-
> >  arch/x86/kernel/cpu/common.c                     | 16 ++--------------
> >  arch/x86/kvm/x86.h                               |  4 ++--
> >  drivers/iommu/amd/init.c                         |  4 ++--
> >  drivers/iommu/intel/svm.c                        |  4 ++--
> >  tools/testing/selftests/kvm/x86/set_sregs_test.c |  2 +-
> >  8 files changed, 13 insertions(+), 24 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> > index f67a93fc9391..d59bee5907e7 100644
> > --- a/arch/x86/include/asm/cpufeatures.h
> > +++ b/arch/x86/include/asm/cpufeatures.h
> > @@ -395,7 +395,7 @@
> >  #define X86_FEATURE_AVX512_BITALG    (16*32+12) /* "avx512_bitalg" Support for VPOPCNT[B,W] and VPSHUF-BITQMB instructions */
> >  #define X86_FEATURE_TME                      (16*32+13) /* "tme" Intel Total Memory Encryption */
> >  #define X86_FEATURE_AVX512_VPOPCNTDQ (16*32+14) /* "avx512_vpopcntdq" POPCNT for vectors of DW/QW */
> > -#define X86_FEATURE_LA57             (16*32+16) /* "la57" 5-level page tables */
> > +#define X86_FEATURE_LA57             (16*32+16) /* "la57_hw" 5-level page tables */
> >  #define X86_FEATURE_RDPID            (16*32+22) /* "rdpid" RDPID instruction */
> >  #define X86_FEATURE_BUS_LOCK_DETECT  (16*32+24) /* "bus_lock_detect" Bus Lock detect */
> >  #define X86_FEATURE_CLDEMOTE         (16*32+25) /* "cldemote" CLDEMOTE instruction */
> > @@ -483,6 +483,7 @@
> >  #define X86_FEATURE_PREFER_YMM               (21*32+ 8) /* Avoid ZMM registers due to downclocking */
> >  #define X86_FEATURE_APX                      (21*32+ 9) /* Advanced Performance Extensions */
> >  #define X86_FEATURE_INDIRECT_THUNK_ITS       (21*32+10) /* Use thunk for indirect branches in lower half of cacheline */
> > +#define X86_FEATURE_5LEVEL_PAGING    (21*32+11) /* "la57" Whether 5 levels of page tables are in use */
>
> So there's a new complication here, KVM doesn't like the use of
> synthethic CPU flags, for understandable reasons:
>
>   inlined from ‘intel_pmu_set_msr’ at arch/x86/kvm/vmx/pmu_intel.c:369:7:
>   ...
>   ./arch/x86/kvm/reverse_cpuid.h:102:9: note: in expansion of macro ‘BUILD_BUG_ON’
>     102 |         BUILD_BUG_ON(x86_leaf == CPUID_LNX_5);
>         |         ^~~~~~~~~~~~
>
> (See x86-64 allmodconfig)
>
> Even though previously X86_FEATURE_LA57 was effectively a synthethic
> CPU flag (it got artificially turned off by the Linux kernel if 5-level
> paging was disabled) ...
>
> So I think the most straightforward solution would be to do the change
> below, and pass through LA57 flag if 5-level paging is enabled in the
> host kernel. This is similar to as if the firmware turned off LA57, and
> it doesn't bring in all the early-boot complications bare metal has. It
> should also match the previous behavior I think.
>
> Thoughts?
>
> Thanks,
>
>         Ingo
>
> =================>
>
>  arch/x86/kvm/cpuid.c | 6 ++++++
>  arch/x86/kvm/x86.h   | 4 ++--
>  2 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 571c906ffcbf..d951d71aea3b 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -1221,6 +1221,12 @@ void kvm_set_cpu_caps(void)
>                 kvm_cpu_cap_clear(X86_FEATURE_RDTSCP);
>                 kvm_cpu_cap_clear(X86_FEATURE_RDPID);
>         }
> +       /*
> +        * Clear the LA57 flag in the guest if the host kernel
> +        * does not have 5-level paging support:
> +        */
> +       if (kvm_cpu_cap_has(X86_FEATURE_LA57) && !pgtable_l5_enabled())
> +               kvm_cpu_cap_clear(X86_FEATURE_LA57);
>  }
>  EXPORT_SYMBOL_GPL(kvm_set_cpu_caps);
>
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index d2c093f17ae5..9dc32a409076 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -243,7 +243,7 @@ static inline u8 vcpu_virt_addr_bits(struct kvm_vcpu *vcpu)
>
>  static inline u8 max_host_virt_addr_bits(void)
>  {
> -       return kvm_cpu_cap_has(X86_FEATURE_5LEVEL_PAGING) ? 57 : 48;
> +       return kvm_cpu_cap_has(X86_FEATURE_LA57) ? 57 : 48;

Based on the comment that documents is_noncanonical_address(), it
seems to me that this should be using cpu_has(X86_FEATURE_LA57) rather
than kvm_cpu_cap_has(X86_FEATURE_LA57). Whether the host actually runs
with 5-level paging should be irrelevant, it only matters whether it
is capable of doing so.

>  }
>
>  /*
> @@ -603,7 +603,7 @@ static inline bool __kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>                 __reserved_bits |= X86_CR4_FSGSBASE;    \
>         if (!__cpu_has(__c, X86_FEATURE_PKU))           \
>                 __reserved_bits |= X86_CR4_PKE;         \
> -       if (!__cpu_has(__c, X86_FEATURE_5LEVEL_PAGING))          \
> +       if (!__cpu_has(__c, X86_FEATURE_LA57))          \
>                 __reserved_bits |= X86_CR4_LA57;        \

This could be

if (!__cpu_has(__c, X86_FEATURE_LA57) || !pgtable_l5_enabled())
    __reserved_bits |= X86_CR4_LA57;

>         if (!__cpu_has(__c, X86_FEATURE_UMIP))          \
>                 __reserved_bits |= X86_CR4_UMIP;        \

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ