Message-ID: <20250323190647.GA1009914@ax162>
Date: Sun, 23 Mar 2025 14:06:47 -0500
From: Nathan Chancellor <nathan@...nel.org>
To: Mike Rapoport <rppt@...nel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Jiaxun Yang <jiaxun.yang@...goat.com>,
Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
linux-kernel@...r.kernel.org, linux-mips@...r.kernel.org,
linux-arch@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v2 11/13] arch, mm: streamline HIGHMEM freeing

Hi Mike,

On Thu, Mar 13, 2025 at 03:50:01PM +0200, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)" <rppt@...nel.org>
>
> All architectures that support HIGHMEM have their own code that frees high
> memory pages to the buddy allocator, while __free_memory_core() is limited
> to freeing only low memory.
>
> There is no actual reason for that. The memory map is completely ready
> by the time memblock_free_all() is called and high pages can be released to
> the buddy allocator along with low memory.
>
> Remove low memory limit from __free_memory_core() and drop per-architecture
> code that frees high memory pages.
>
> Acked-by: Dave Hansen <dave.hansen@...ux.intel.com> # x86
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@...nel.org>
...
> diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
> index ed9dde6a00f7..075177e817ac 100644
> --- a/arch/mips/mm/init.c
> +++ b/arch/mips/mm/init.c
> @@ -425,25 +425,6 @@ void __init paging_init(void)
> static struct kcore_list kcore_kseg0;
> #endif
>
> -static inline void __init mem_init_free_highmem(void)
> -{
> -#ifdef CONFIG_HIGHMEM
> -	unsigned long tmp;
> -
> -	if (cpu_has_dc_aliases)
> -		return;
> -
> -	for (tmp = highstart_pfn; tmp < highend_pfn; tmp++) {
> -		struct page *page = pfn_to_page(tmp);
> -
> -		if (!memblock_is_memory(PFN_PHYS(tmp)))
> -			SetPageReserved(page);
> -		else
> -			free_highmem_page(page);
> -	}
> -#endif
> -}
> -
> void __init mem_init(void)
> {
> 	/*
> @@ -454,7 +435,6 @@ void __init mem_init(void)
>
> 	maar_init();
> 	setup_zero_pages(); /* Setup zeroed pages. */
> -	mem_init_free_highmem();
> 	memblock_free_all();
>
> #ifdef CONFIG_64BIT
...
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index fdf20503850e..6fccd3b3248c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3172,7 +3172,6 @@ extern void reserve_bootmem_region(phys_addr_t start,
>
> /* Free the reserved page into the buddy system, so it gets managed. */
> void free_reserved_page(struct page *page);
> -#define free_highmem_page(page) free_reserved_page(page)
>
> static inline void mark_page_reserved(struct page *page)
> {
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 95af35fd1389..64ae678cd1d1 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -2164,8 +2164,7 @@ static unsigned long __init __free_memory_core(phys_addr_t start,
> phys_addr_t end)
> {
> 	unsigned long start_pfn = PFN_UP(start);
> -	unsigned long end_pfn = min_t(unsigned long,
> -				      PFN_DOWN(end), max_low_pfn);
> +	unsigned long end_pfn = PFN_DOWN(end);
>
> 	if (start_pfn >= end_pfn)
> 		return 0;

I bisected a crash that I am seeing to this change, which is in -next as
commit 6faea3422e3b ("arch, mm: streamline HIGHMEM freeing").

$ cat arch/mips/configs/repro.config
CONFIG_RELOCATABLE=y
CONFIG_RELOCATION_TABLE_SIZE=0x00200000
CONFIG_RANDOMIZE_BASE=y
$ make -skj"$(nproc)" ARCH=mips CROSS_COMPILE=mips-linux- mrproper malta_defconfig repro.config vmlinux
$ qemu-system-mipsel \
-display none \
-nodefaults \
-cpu 24Kf \
-machine malta \
-kernel vmlinux \
-initrd rootfs.cpio \
-m 512m \
-serial mon:stdio
Linux version 6.14.0-rc6-00359-g6faea3422e3b (nathan@...62) (mips-linux-gcc (GCC) 14.2.0, GNU ld (GNU Binutils) 2.42) #1 SMP Fri Mar 21 08:12:02 MST 2025
earlycon: uart8250 at I/O port 0x3f8 (options '38400n8')
printk: legacy bootconsole [uart8250] enabled
Config serial console: console=ttyS0,38400n8r
CPU0 revision is: 00019300 (MIPS 24Kc)
FPU revision is: 00739300
MIPS: machine is mti,malta
Software DMA cache coherency enabled
Initial ramdisk at: 0x8fad0000 (5360128 bytes)
OF: reserved mem: Reserved memory: No reserved-memory node in the DT
Primary instruction cache 2kB, VIPT, 2-way, linesize 16 bytes.
Primary data cache 2kB, 2-way, VIPT, no aliases, linesize 16 bytes
Zone ranges:
DMA [mem 0x0000000000000000-0x0000000000ffffff]
Normal [mem 0x0000000001000000-0x000000001fffffff]
Movable zone start for each node
Early memory node ranges
node 0: [mem 0x0000000000000000-0x000000000fffffff]
node 0: [mem 0x0000000090000000-0x000000009fffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x000000009fffffff]
On node 0, zone Normal: 16384 pages in unavailable ranges
random: crng init done
percpu: Embedded 3 pages/cpu s18832 r8192 d22128 u49152
Kernel command line: rd_start=0xffffffff8fad0000 rd_size=5360128 console=ttyS0,38400n8r
printk: log buffer data + meta data: 32768 + 102400 = 135168 bytes
Dentry cache hash table entries: 65536 (order: 4, 262144 bytes, linear)
Inode-cache hash table entries: 32768 (order: 3, 131072 bytes, linear)
Writing ErrCtl register=00000000
Readback ErrCtl register=00000000
Built 1 zonelists, mobility grouping on. Total pages: 16384
mem auto-init: stack:all(zero), heap alloc:off, heap free:off
Unhandled kernel unaligned access[#1]:
CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.14.0-rc6-00359-g6faea3422e3b #1
Hardware name: mti,malta
$ 0 : 00000000 00000001 81cb0880 00129027
$ 4 : 00000001 0000000a 00000002 00129026
$ 8 : ffffdfff 80101e00 00000002 00000000
$12 : 81c9c224 81c63e68 00000002 00000000
$16 : 805b1e00 00025800 81cb0880 00000002
$20 : 00000000 81c63e64 0000000a 81f10000
$24 : 81c63e64 81c63e60
$28 : 81c60000 81c63de0 00000001 81cc9d20
Hi : 00000000
Lo : 00000000
epc : 814a227c __free_pages_ok+0x144/0x3c0
ra : 81cc9d20 memblock_free_all+0x1d4/0x27c
Status: 10000002 KERNEL EXL
Cause : 00800410 (ExcCode 04)
BadVA : 00129026
PrId : 00019300 (MIPS 24Kc)
Modules linked in:
Process swapper (pid: 0, threadinfo=(ptrval), task=(ptrval), tls=00000000)
Stack : 81f10000 805a9e00 81c80000 00000000 00000002 814aa240 000003ff 00000400
00000000 81f10000 81c9c224 00003b1f 81c80000 81c63e60 81ca0000 81c63e64
81f10000 0000000a 0000001f 81cc9d20 81f10000 81cc96d8 00000000 81c80000
81c9c224 81c63e60 81c63e64 00000000 81f10000 00024000 00028000 00025c00
90000000 a0000000 00000002 00000017 00000000 00000000 81f10000 81f10000
...
Call Trace:
[<814a227c>] __free_pages_ok+0x144/0x3c0
[<81cc9d20>] memblock_free_all+0x1d4/0x27c
[<81cc6764>] mm_core_init+0x100/0x138
[<81cb4ba4>] start_kernel+0x4a0/0x6e4
Code: 1080ffd5 02003825 2467ffff <8ce30000> 7c630500 1060ffd4 00000000 8ce30000 7c630180
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

At the immediate parent of that change, the boot completes fine.

Linux version 6.14.0-rc6-00358-ge120d1bc12da (nathan@...62) (mips-linux-gcc (GCC) 14.2.0, GNU ld (GNU Binutils) 2.42) #1 SMP Sun Mar 23 13:57:15 CDT 2025
earlycon: uart8250 at I/O port 0x3f8 (options '38400n8')
printk: legacy bootconsole [uart8250] enabled
Config serial console: console=ttyS0,38400n8r
CPU0 revision is: 00019300 (MIPS 24Kc)
FPU revision is: 00739300
MIPS: machine is mti,malta
Software DMA cache coherency enabled
Initial ramdisk at: 0x8fad0000 (5360128 bytes)
OF: reserved mem: Reserved memory: No reserved-memory node in the DT
Primary instruction cache 2kB, VIPT, 2-way, linesize 16 bytes.
Primary data cache 2kB, 2-way, VIPT, no aliases, linesize 16 bytes
Zone ranges:
DMA [mem 0x0000000000000000-0x0000000000ffffff]
Normal [mem 0x0000000001000000-0x000000001fffffff]
Movable zone start for each node
Early memory node ranges
node 0: [mem 0x0000000000000000-0x000000000fffffff]
node 0: [mem 0x0000000090000000-0x000000009fffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x000000009fffffff]
On node 0, zone Normal: 16384 pages in unavailable ranges
random: crng init done
percpu: Embedded 3 pages/cpu s18832 r8192 d22128 u49152
Kernel command line: rd_start=0xffffffff8fad0000 rd_size=5360128 console=ttyS0,38400n8r
printk: log buffer data + meta data: 32768 + 102400 = 135168 bytes
Dentry cache hash table entries: 65536 (order: 4, 262144 bytes, linear)
Inode-cache hash table entries: 32768 (order: 3, 131072 bytes, linear)
Writing ErrCtl register=00000000
Readback ErrCtl register=00000000
Built 1 zonelists, mobility grouping on. Total pages: 16384
mem auto-init: stack:all(zero), heap alloc:off, heap free:off
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
rcu: Hierarchical RCU implementation.
rcu: RCU event tracing is enabled.
rcu: RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=1.
rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
NR_IRQS: 256
rcu: srcu_init: Setting srcu_struct sizes based on contention.
CPU frequency 320.00 MHz
clocksource: MIPS: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 11945390257 ns
sched_clock: 32 bits at 160MHz, resolution 6ns, wraps every 13421787132ns
Console: colour dummy device 80x25
Calibrating delay loop... 1895.62 BogoMIPS (lpj=9478144)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 4096 (order: 0, 16384 bytes, linear)
Mountpoint-cache hash table entries: 4096 (order: 0, 16384 bytes, linear)
rcu: Hierarchical SRCU implementation.
rcu: Max phase no-delay instances is 1000.
smp: Bringing up secondary CPUs ...
smp: Brought up 1 node, 1 CPU
Memory: 241200K/262144K available (7844K kernel code, 330K rwdata, 1424K rodata, 2352K init, 224K bss, 19984K reserved, 0K cma-reserved)
devtmpfs: initialized
...
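
A rough sketch of the PFN arithmetic behind the quoted mm/memblock.c hunk,
applied to the two node ranges from the logs above. It is an illustration
under stated assumptions, not a diagnosis: the 16 KiB page size is inferred
from the "order: 0, 16384 bytes" log lines, max_low_pfn is assumed to sit at
the end of zone Normal (0x1fffffff), and every sketch_* name is hypothetical
rather than a kernel symbol.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16 KiB pages, inferred from "order: 0, 16384 bytes" in the logs. */
#define SKETCH_PAGE_SHIFT 14u

/* Assumption: max_low_pfn equals the first PFN past zone Normal (0x1fffffff). */
#define SKETCH_MAX_LOW_PFN (0x20000000ull >> SKETCH_PAGE_SHIFT)

/* Round a physical address up to a page frame number, in the spirit of PFN_UP(). */
static uint64_t sketch_pfn_up(uint64_t addr)
{
	return (addr + (1ull << SKETCH_PAGE_SHIFT) - 1) >> SKETCH_PAGE_SHIFT;
}

/* Round a physical address down to a page frame number, in the spirit of PFN_DOWN(). */
static uint64_t sketch_pfn_down(uint64_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/* Page count selected by the old computation, which clamps end_pfn to low memory. */
static uint64_t sketch_pages_old(uint64_t start, uint64_t end)
{
	uint64_t start_pfn = sketch_pfn_up(start);
	uint64_t end_pfn = sketch_pfn_down(end);

	assert(start <= end);
	if (end_pfn > SKETCH_MAX_LOW_PFN)
		end_pfn = SKETCH_MAX_LOW_PFN;

	return start_pfn >= end_pfn ? 0 : end_pfn - start_pfn;
}

/* Page count selected by the new computation, which no longer clamps end_pfn. */
static uint64_t sketch_pages_new(uint64_t start, uint64_t end)
{
	uint64_t start_pfn = sketch_pfn_up(start);
	uint64_t end_pfn = sketch_pfn_down(end);

	assert(start <= end);

	return start_pfn >= end_pfn ? 0 : end_pfn - start_pfn;
}
```

Under those assumptions, the range 0x90000000-0xa0000000 maps to PFNs
0x24000-0x28000 (both values also appear in the stack dump above). The
clamped computation selects zero pages for it, while the unclamped one
selects 0x4000 pages, all of them above the end of zone Normal. The low
range 0x00000000-0x10000000 selects 0x4000 pages either way.
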

If there is any additional information I can provide or patches I can
test, I am more than happy to do so.

Cheers,
Nathan