[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=whxP5Ywpv-U=2NPFhk929VHB9_kdp10+HFJQ4VxEGdX9A@mail.gmail.com>
Date: Tue, 6 May 2025 10:03:16 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Ard Biesheuvel <ardb@...nel.org>
Cc: Ard Biesheuvel <ardb+git@...gle.com>, linux-kernel@...r.kernel.org, x86@...nel.org,
Ingo Molnar <mingo@...nel.org>
Subject: Re: [RFC PATCH 3/3] x86/boot: Use alternatives based selector for
5-level paging constants
On Tue, 6 May 2025 at 09:35, Ard Biesheuvel <ardb@...nel.org> wrote:
>
> I think the first two patches are important, though, as they are about
> robustness/consistency rather than optimization.
That first patch already has another copy of that insane inline asm
optimization.
Let's fix this properly, not this disgusting way.
Let's get rid of that USE_EARLY_PGTABLE_L5 crazy case entirely, and
fix the few places where it is currently used.
That code is bogus *anyway*, because your argument for these patches
is "one single truth", but the fact is, there's at least *SIX*
different values that get set depending on this value: pgdir_shift,
ptrs_per_p4d, vmalloc_base, etc. All of those change depending on
whether we do 5-level page tables or not, so the whole argument that
"pgtable_l5_enabled() is special" is just wrong to begin with.
Any code that makes pgtable_l5_enabled() will fundamentally then just
have *another* inconsistency, namely the inconsistency between that
"is_enabled" and all the other values that L5 paging actually
modifies.
So I don't think your patches even fix anything. They only paper over
one very particular issue.
For example, as far as I can tell, the only real reason for it in
arch/x86/kernel/cpu/common.c is *one* single use where we do that
if (!pgtable_l5_enabled())
setup_clear_cpu_cap(X86_FEATURE_LA57);
in early_identify_cpu().
And then we have __early_make_pgtable(), but that's the SAME FILE that
has all the magical special __pgtable_l5_enabled logic anyway. So that
damn well could just write out the actual real logic, instead of using
that "is L5 enabled" helper function THAT IT IS ITSELF INITIALIZING.
So I reall ythink this whole issue goes much deeper, and is much more
broken than your patches imply. And I think your patches in many ways
make it *worse*, because they may make that pgtable_l5_enabled() be
set up early, but that just hides all the *other* issues that aren't.,
As a very real example of that, just look at what happens in
arch/x86/mm/mem_encrypt_identity.c
It does all that page table setup, but if __pgtable_l5_enabled hassn't
been set up yet, then all those *othger* values that go with it also
haven't been set up yet, so now it uses the *wrong* value for
ptrs_per_p4d etc.
And I just checked: that code very much does use ptrs_per_p4d,
although it's hidden by
memset(p4d, 0, sizeof(*p4d) * PTRS_PER_P4D);
so I really think this is all papering over things. That code CANNOT
WORK before __pgtable_l5_enabled hass been initialized in its
*current* location.
IOW, I think your patches only make things *less* consistent, not more.
Linus
Powered by blists - more mailing lists