Message-ID: <aMNJMFa5fDalFmtn@p100>
Date: Fri, 12 Sep 2025 00:12:00 +0200
From: Helge Deller <deller@...nel.org>
To: Toke Høiland-Jørgensen <toke@...hat.com>,
David Hildenbrand <david@...hat.com>,
Linux Kernel Development <linux-kernel@...r.kernel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
linux-parisc <linux-parisc@...r.kernel.org>
Cc: Christoph Biedl <linux-kernel.bfrz@...chmal.in-ulm.de>,
Helge Deller <deller@....de>
Subject: boot failure because of inaccurate page_pool_page_is_pp() on 32-bit
kernels
As reported earlier in this mail thread, all 32-bit Linux kernels since v6.16
fail to boot on the parisc architecture like this:
BUG: Bad page state in process swapper pfn:000f7
page: refcount:0 mapcount:0 mapping:00000000 index:0x0 pfn:0xf7
flags: 0x0(zone=0)
raw: 00000000 118022c0 118022c0 00000000 00000000 00000000 ffffffff 00000000
raw: 00000000
page dumped because: page_pool leak
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc1-32bit+ #2730 NONE
Hardware name: 9000/778/B160L
Backtrace:
[<106ece88>] bad_page+0x14c/0x17c
[<10406c50>] free_page_is_bad.part.0+0xd4/0xec
[<106ed180>] free_page_is_bad+0x80/0x88
[<106ef05c>] __free_pages_ok+0x374/0x508
[<1011d34c>] __free_pages_core+0x1f0/0x218
[<1011a2f0>] memblock_free_pages+0x68/0x94
[<10120324>] memblock_free_all+0x26c/0x310
[<1011a4d8>] mm_core_init+0x18c/0x208
[<10100e88>] start_kernel+0x4ec/0x7a0
[<101054d0>] start_parisc+0xb4/0xc4
Git bisection points to this commit as the one that triggers the crash:
commit ee62ce7a1d909ccba0399680a03c2dee83bcae95
Author: Toke Høiland-Jørgensen <toke@...hat.com>
Date: Wed Apr 9 12:41:37 2025 +0200
page_pool: Track DMA-mapped pages and unmap them when destroying the pool
It turns out that the patch itself isn't wrong.
It is nevertheless what triggers the kernel bug, because it changes
PP_MAGIC_MASK for 32-bit kernels as follows:
-#define PP_MAGIC_MASK ~0x3UL
+#define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
The function page_pool_page_is_pp() has to identify page pool pages
unambiguously (using PP_MAGIC_MASK). The patch reduces the bits checked via
PP_MAGIC_MASK from 0xFFFFFFFC (30 bits) to 0xc000007c (7 bits), and the
remaining bits are no longer sufficient to identify such pages unambiguously.
As a result, page_pool_page_is_pp() sometimes wrongly reports ordinary pages
as page pool pages, and the kernel then triggers the BUG because it believes
it has found a page pool leak.
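To make the false positive concrete, here is a small userspace sketch of the
mask comparison, applied to the page from the dump above. It assumes that
POISON_POINTER_DELTA is 0 (so PP_SIGNATURE is 0x40), that the check is
(pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE, and that the second word of the
"raw:" line (0x118022c0) is the one that overlays page->pp_magic. The names
looks_like_pp(), check_dumped_word() and the *_32 macros are made up for this
illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Values as they work out on a 32-bit kernel, assuming
 * POISON_POINTER_DELTA == 0. uint32_t stands in for the kernel's
 * 32-bit unsigned long so the sketch also builds on a 64-bit host.
 */
#define PP_SIGNATURE_32      UINT32_C(0x40)
#define PP_MAGIC_MASK_OLD_32 UINT32_C(0xFFFFFFFC) /* ~0x3UL */
#define PP_MAGIC_MASK_NEW_32 UINT32_C(0xC000007C) /* ~(PP_DMA_INDEX_MASK | 0x3UL) */

/* Second word of the "raw:" dump, assumed to overlay page->pp_magic. */
#define DUMPED_WORD UINT32_C(0x118022c0)

/* Hypothetical helper mirroring the comparison in page_pool_page_is_pp(). */
static bool looks_like_pp(uint32_t pp_magic, uint32_t mask)
{
    return (pp_magic & mask) == PP_SIGNATURE_32;
}

/* Checks the dumped word against both masks. */
static void check_dumped_word(void)
{
    /* Old mask: 30 bits are compared, the word does not match 0x40. */
    assert(!looks_like_pp(DUMPED_WORD, PP_MAGIC_MASK_OLD_32));
    /* New mask: only 7 bits are compared, 0x118022c0 & 0xC000007C == 0x40. */
    assert(looks_like_pp(DUMPED_WORD, PP_MAGIC_MASK_NEW_32));
}
```

Under these assumptions the dumped word, which looks like an ordinary list
pointer, passes the test with the new mask only because its low byte 0xc0
happens to have bit 6 set and bits 2-5 clear.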
IMHO this is a generic 32-bit kernel issue, not just affecting parisc.
Do you see any options other than:
a) revert the patch (ee62ce7a1d90), or:
b) return false in page_pool_page_is_pp() when !defined(CONFIG_64BIT),
which effectively disables the page pool page test on 32-bit
machines
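For illustration only, option b) could look roughly like the userspace model
below. This is not a tested kernel patch: MODEL_CONFIG_64BIT is a made-up
stand-in for CONFIG_64BIT, model_page_is_pp() is a made-up stand-in for
page_pool_page_is_pp(), and the mask value in the 64-bit branch is only a
placeholder.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for CONFIG_64BIT; 0 models a 32-bit build. */
#ifndef MODEL_CONFIG_64BIT
#define MODEL_CONFIG_64BIT 0
#endif

#define MODEL_PP_SIGNATURE  UINT32_C(0x40)
#define MODEL_PP_MAGIC_MASK UINT32_C(0xC000007C) /* placeholder value */

/* Hypothetical model of page_pool_page_is_pp() under option b). */
static bool model_page_is_pp(uint32_t pp_magic)
{
#if MODEL_CONFIG_64BIT
    return (pp_magic & MODEL_PP_MAGIC_MASK) == MODEL_PP_SIGNATURE;
#else
    /* Test disabled: no false positives, but no leak detection either. */
    (void)pp_magic;
    return false;
#endif
}
```

The trade-off is visible in the 32-bit branch: the word from the dump above is
no longer misreported, but a page that really carries the signature would not
be detected either.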
Helge