[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z_4ISTuGo8VmZt9X@kernel.org>
Date: Tue, 15 Apr 2025 10:18:33 +0300
From: Mike Rapoport <rppt@...nel.org>
To: Dave Hansen <dave.hansen@...el.com>
Cc: linux-kernel@...r.kernel.org, linux-tip-commits@...r.kernel.org,
Arnd Bergmann <arnd@...nel.org>, Ingo Molnar <mingo@...nel.org>,
Andy Shevchenko <andy@...nel.org>, Arnd Bergmann <arnd@...db.de>,
Davide Ciminaghi <ciminaghi@...dd.com>,
"H. Peter Anvin" <hpa@...or.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>, kvm@...r.kernel.org,
x86@...nel.org
Subject: Re: [tip: x86/urgent] x86/e820: Discard high memory that can't be
addressed by 32-bit systems
On Mon, Apr 14, 2025 at 07:19:02AM -0700, Dave Hansen wrote:
> On 4/13/25 02:23, tip-bot2 for Mike Rapoport (Microsoft) wrote:
> > + /*
> > + * 32-bit systems are limited to 4BG of memory even with HIGHMEM and
> > + * to even less without it.
> > + * Discard memory after max_pfn - the actual limit detected at runtime.
> > + */
> > + if (IS_ENABLED(CONFIG_X86_32))
> > + memblock_remove(PFN_PHYS(max_pfn), -1);
>
> Mike, thanks for the quick fix! I did verify that this gets my silly
> test VM booting again.
>
> The patch obviously _works_. But in the case I was hitting max_pfn was
> set MAX_NONPAE_PFN. The unfortunate part about this hunk is that it's
> far away from the related warning:
Yeah, my first instinct was to put memblock_remove() in the same 'if',
but there's no memblock there yet :)
> > if (max_pfn > MAX_NONPAE_PFN) {
> > max_pfn = MAX_NONPAE_PFN;
> > printk(KERN_WARNING MSG_HIGHMEM_TRIMMED);
> > }
>
> and it's logically doing the same thing: truncating memory at
> MAX_NONPAE_PFN.
>
> How about we reuse 'MAX_NONPAE_PFN' like this:
>
> if (IS_ENABLED(CONFIG_X86_32))
> memblock_remove(PFN_PHYS(MAX_NONPAE_PFN), -1);
>
> Would that make the connection more obvious?
Yes, that's better. Here's the updated patch:
>From a235764221e4a849fa274a546ff2a3d9f15da2a9 Mon Sep 17 00:00:00 2001
From: "Mike Rapoport (Microsoft)" <rppt@...nel.org>
Date: Sun, 13 Apr 2025 10:36:17 +0300
Subject: [PATCH v2] x86/e820: discard high memory that can't be addressed by
32-bit systems
Dave Hansen reports the following crash on a 32-bit system with
CONFIG_HIGHMEM=y and CONFIG_X86_PAE=y:
> 0xf75fe000 is the mem_map[] entry for the first page >4GB. It
> obviously wasn't allocated, thus the oops.
BUG: unable to handle page fault for address: f75fe000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
*pdpt = 0000000002da2001 *pde = 000000000300c067 *pte = 0000000000000000
Oops: Oops: 0002 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc1-00288-ge618ee89561b-dirty #311 PREEMPT(undef)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
EIP: __free_pages_core+0x3c/0x74
Code: c3 d3 e6 83 ec 10 89 44 24 08 89 74 24 04 c7 04 24 c6 32 3a c2 89 55 f4 e8 a9 11 45 fe 85 f6 8b 55 f4 74 19 89 d8 31 c9 66 90 <0f> ba 30 0d c7 40 1c 00 00 00 00 41 83 c0 28 39 ce 75 ed 8b
EAX: f75fe000 EBX: f75fe000 ECX: 00000000 EDX: 0000000a
ESI: 00000400 EDI: 00500000 EBP: c247becc ESP: c247beb4
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00210046
CR0: 80050033 CR2: f75fe000 CR3: 02da6000 CR4: 000000b0
Call Trace:
memblock_free_pages+0x11/0x2c
memblock_free_all+0x2ce/0x3a0
mm_core_init+0xf5/0x320
start_kernel+0x296/0x79c
? set_init_arg+0x70/0x70
? load_ucode_bsp+0x13c/0x1a8
i386_start_kernel+0xad/0xb0
startup_32_smp+0x151/0x154
Modules linked in:
CR2: 00000000f75fe000
The mem_map[] is allocated up to the end of ZONE_HIGHMEM which is defined
by max_pfn.
Before 6faea3422e3b ("arch, mm: streamline HIGHMEM freeing") freeing of
high memory was also clamped to the end of ZONE_HIGHMEM but after
6faea3422e3b memblock_free_all() tries to free memory above the of
ZONE_HIGHMEM as well and that causes access to mem_map[] entries beyond
the end of the memory map.
Discard the memory after MAX_NONPAE_PFN from memblock on 32-bit systems
so that core MM would be aware only of actually usable memory.
Reported-by: Dave Hansen <dave.hansen@...el.com>
Tested-by: Arnd Bergmann <arnd@...nel.org>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@...nel.org>
---
arch/x86/kernel/e820.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 57120f0749cc..5e6b1034e6f1 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -1300,6 +1300,13 @@ void __init e820__memblock_setup(void)
memblock_add(entry->addr, entry->size);
}
+ /*
+ * Discard memory above 4GB because 32-bit systems are limited to 4GB
+ * of memory even with HIGHMEM.
+ */
+ if (IS_ENABLED(CONFIG_X86_32))
+ memblock_remove(PFN_PHYS(MAX_NONPAE_PFN), -1);
+
/* Throw away partial pages: */
memblock_trim_memory(PAGE_SIZE);
--
2.47.2
--
Sincerely yours,
Mike.
Powered by blists - more mailing lists