linux-kernel - [PATCH] x86: fix oops caused by old EFI info on kexec boot


Message-ID: <20251126173209.374755-2-chewi@gentoo.org>
Date: Wed, 26 Nov 2025 17:32:10 +0000
From: James Le Cuirot <chewi@...too.org>
To: x86@...nel.org
Cc: linux-kernel@...r.kernel.org,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	Borislav Petkov <bp@...en8.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	"H . Peter Anvin" <hpa@...or.com>,
	Ard Biesheuvel <ardb@...nel.org>,
	James Le Cuirot <chewi@...too.org>
Subject: [PATCH] x86: fix oops caused by old EFI info on kexec boot

kexec on x86 passes initrd details via the boot_params. If no initrd is
supplied, then ramdisk_size is 0. When determining whether to reserve
memory for the initrd on the subsequent boot, ramdisk_size being 0
causes the logic to fall back to phys_initrd_start and phys_initrd_size
set from the EFI tables in efi.c. This is stale information from the
initial boot. The system continues to boot and has even been seen to
function under heavy load for days, but allocating very large amounts of
memory reliably triggers an oops rather than the OOM killer.

  BUG: kernel NULL pointer dereference, address: 0000000000000008
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 0 P4D 0
  Oops: Oops: 0002 [#1] SMP NOPTI

This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8
when the EFI stub initrd loading was unified between architectures.

Avoid the issue by checking that the bootloader is not kexec before
falling back to the EFI table values.
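
For illustration only, and not part of the patch: the check can be
modelled outside the kernel roughly as below. The x86 boot protocol
encodes type_of_loader as 0xTV, where T is the loader ID and V its
version, and ID 0xD is assigned to kexec-tools. The struct, the helper
names and the efi_initrd_start parameter are hypothetical stand-ins for
boot_params and phys_initrd_start, so treat this as a sketch of the
decision rather than the kernel code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the boot_params fields involved. */
struct boot_info {
	uint8_t  type_of_loader;	/* 0xTV: T = loader ID, V = version */
	uint32_t ramdisk_image;		/* low 32 bits of the initrd address */
	uint32_t ext_ramdisk_image;	/* high 32 bits of the initrd address */
};

#define LOADER_ID_KEXEC 0xD	/* kexec-tools, per the x86 boot protocol */

/* True when the high nibble of type_of_loader identifies kexec-tools. */
static bool loaded_by_kexec(const struct boot_info *bi)
{
	return (bi->type_of_loader >> 4) == LOADER_ID_KEXEC;
}

/*
 * Mirrors the patched logic: use the bootloader-supplied address, and
 * fall back to the EFI-provided value (assumed to be passed in as
 * efi_initrd_start) only when the loader is not kexec.
 */
static uint64_t pick_ramdisk_image(const struct boot_info *bi,
				   uint64_t efi_initrd_start)
{
	uint64_t image = bi->ramdisk_image;

	image |= (uint64_t)bi->ext_ramdisk_image << 32;

	if (image == 0 && !loaded_by_kexec(bi))
		image = efi_initrd_start;

	return image;
}
```

With this model, a kexec boot without an initrd yields 0 even if a
stale EFI value is present, while any other loader still gets the EFI
fallback.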

I strongly suspect this also affects other architectures. A different
fix would be required there, and I do have a fix in mind, but I was
unable to reproduce the issue under QEMU's aarch64 virt machine. I think
this is at least partly because it relies on ACPI while kexec passes the
initrd details via the device tree.

Signed-off-by: James Le Cuirot <chewi@...too.org>
---
 arch/x86/kernel/setup.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 1b2edd07a3e1..8aa65daf121f 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void)

 	ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32;

-	if (ramdisk_image == 0)
+	/* Don't fall back for kexec as phys_initrd_start will be stale */
+	if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
 		ramdisk_image = phys_initrd_start;

 	return ramdisk_image;
@@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void)

 	ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32;

-	if (ramdisk_size == 0)
+	/* Don't fall back for kexec as phys_initrd_size will be stale */
+	if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
 		ramdisk_size = phys_initrd_size;

 	return ramdisk_size;
-- 
2.51.2