Message-ID: <CAMj1kXEYYSc8=qMmDW6E2kRFawK34okGvq=rTuhvv5hVPsd-iw@mail.gmail.com>
Date: Thu, 23 Oct 2025 16:13:26 +0200
From: Ard Biesheuvel <ardb@...nel.org>
To: Usama Arif <usamaarif642@...il.com>
Cc: dwmw@...zon.co.uk, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, hpa@...or.com, x86@...nel.org,
apopple@...dia.com, thuth@...hat.com, nik.borisov@...e.com, kas@...nel.org,
linux-kernel@...r.kernel.org, linux-efi@...r.kernel.org, kernel-team@...a.com,
Michael van der Westhuizen <rmikey@...a.com>, Tobias Fleig <tfleig@...a.com>
Subject: Re: [PATCH 2/3] efi/libstub: Fix page table access in 5-level to
4-level paging transition
On Thu, 23 Oct 2025 at 00:08, Usama Arif <usamaarif642@...il.com> wrote:
>
> When transitioning from 5-level to 4-level paging, the existing code
> incorrectly accesses page table entries by directly dereferencing CR3
> and applying PAGE_MASK. This approach has several issues:
>
> - __native_read_cr3() returns the raw CR3 register value, which on
> x86_64 includes not just the physical address but also flags. Bits
> above the physical address width of the system (i.e. above
> __PHYSICAL_MASK_SHIFT) are also not masked.
> - The pgd value is masked by PAGE_MASK, which doesn't take into account
> the higher bits such as _PAGE_BIT_NOPTISHADOW.
>
> Replace this with proper accessor functions:
> - read_cr3_pa(): Uses CR3_ADDR_MASK, which clears the SME encryption bit
> and extracts only the physical address portion.
> - Mask the pgd value with PTE_PFN_MASK instead of PAGE_MASK, accounting for
> flags above the physical address (_PAGE_BIT_NOPTISHADOW in particular).
>
> Fixes: cb1c9e02b0c1 ("x86/efistub: Perform 4/5 level paging switch from the stub")
> Co-developed-by: Kiryl Shutsemau <kas@...nel.org>
> Signed-off-by: Kiryl Shutsemau <kas@...nel.org>
> Signed-off-by: Usama Arif <usamaarif642@...il.com>
> Reported-by: Michael van der Westhuizen <rmikey@...a.com>
> Reported-by: Tobias Fleig <tfleig@...a.com>
> ---
> drivers/firmware/efi/libstub/x86-5lvl.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/firmware/efi/libstub/x86-5lvl.c b/drivers/firmware/efi/libstub/x86-5lvl.c
> index f1c5fb45d5f7c..34b72da457487 100644
> --- a/drivers/firmware/efi/libstub/x86-5lvl.c
> +++ b/drivers/firmware/efi/libstub/x86-5lvl.c
> @@ -81,8 +81,11 @@ void efi_5level_switch(void)
> new_cr3 = memset(pgt, 0, PAGE_SIZE);
> new_cr3[0] = (u64)cr3 | _PAGE_TABLE_NOENC;
> } else {
> + pgd_t *pgdp;
> +
> + pgdp = (pgd_t *)read_cr3_pa();
Shouldn't this be using native_read_cr3_pa()? And is there any reason
to re-read CR3 here, rather than update the code that populates the
cr3 variable? The other branch of the if(), just above, should probably
use the same sanitised value of CR3, no?
> /* take the new root table pointer from the current entry #0 */
> - new_cr3 = (u64 *)(cr3[0] & PAGE_MASK);
> + new_cr3 = (u64 *)(pgd_val(pgdp[0]) & PTE_PFN_MASK);
>
> /* copy the new root table if it is not 32-bit addressable */
> if ((u64)new_cr3 > U32_MAX)
> --
> 2.47.3
>
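For illustration only, the difference between masking with PAGE_MASK and
with PTE_PFN_MASK can be shown in a small userspace sketch. The DEMO_*
constants and the two helper names are hypothetical stand-ins, not the
kernel's definitions; the 52-bit physical address width and the use of
bit 58 as a high software flag are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel masks; all values are assumptions. */
#define DEMO_PAGE_SHIFT      12
#define DEMO_PAGE_MASK       (~((UINT64_C(1) << DEMO_PAGE_SHIFT) - 1))
#define DEMO_PHYS_MASK_SHIFT 52
#define DEMO_PHYS_MASK       ((UINT64_C(1) << DEMO_PHYS_MASK_SHIFT) - 1)
/* Keeps only the address bits: clears the low flags and everything above bit 51. */
#define DEMO_PFN_MASK        (DEMO_PAGE_MASK & DEMO_PHYS_MASK)
/* Assumed position of a software flag stored above the physical address. */
#define DEMO_HIGH_SW_FLAG    (UINT64_C(1) << 58)

/* Old behaviour: clears only the low 12 bits, so high flag bits survive. */
uint64_t demo_mask_page_only(uint64_t entry)
{
	return entry & DEMO_PAGE_MASK;
}

/* Proposed behaviour: clears both the low flags and the high flag bits. */
uint64_t demo_mask_pfn(uint64_t entry)
{
	return entry & DEMO_PFN_MASK;
}
```

With an entry of 0x0400000123456067 (address 0x123456000, low flags 0x067,
bit 58 set), demo_mask_page_only() yields 0x0400000123456000, which is not
a usable table address, while demo_mask_pfn() yields 0x123456000.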