Message-ID: <aMNJMFa5fDalFmtn@p100>
Date: Fri, 12 Sep 2025 00:12:00 +0200
From: Helge Deller <deller@...nel.org>
To: Toke Høiland-Jørgensen <toke@...hat.com>,
	David Hildenbrand <david@...hat.com>,
	Linux Kernel Development <linux-kernel@...r.kernel.org>,
	Linux Memory Management List <linux-mm@...ck.org>,
	linux-parisc <linux-parisc@...r.kernel.org>
Cc: Christoph Biedl <linux-kernel.bfrz@...chmal.in-ulm.de>,
	Helge Deller <deller@....de>
Subject: boot failure because of inaccurate page_pool_page_is_pp() on 32-bit
 kernels

As reported earlier in this mail thread, all 32-bit Linux kernels since v6.16
fail to boot on the parisc architecture like this:

 BUG: Bad page state in process swapper  pfn:000f7
 page: refcount:0 mapcount:0 mapping:00000000 index:0x0 pfn:0xf7
 flags: 0x0(zone=0)
 raw: 00000000 118022c0 118022c0 00000000 00000000 00000000 ffffffff 00000000
 raw: 00000000
 page dumped because: page_pool leak
 Modules linked in:
 CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc1-32bit+ #2730 NONE
 Hardware name: 9000/778/B160L
 Backtrace:
  [<106ece88>] bad_page+0x14c/0x17c
  [<10406c50>] free_page_is_bad.part.0+0xd4/0xec
  [<106ed180>] free_page_is_bad+0x80/0x88
  [<106ef05c>] __free_pages_ok+0x374/0x508
  [<1011d34c>] __free_pages_core+0x1f0/0x218
  [<1011a2f0>] memblock_free_pages+0x68/0x94
  [<10120324>] memblock_free_all+0x26c/0x310
  [<1011a4d8>] mm_core_init+0x18c/0x208
  [<10100e88>] start_kernel+0x4ec/0x7a0
  [<101054d0>] start_parisc+0xb4/0xc4

A git bisect points to this commit as the one that triggers the crash:

 commit ee62ce7a1d909ccba0399680a03c2dee83bcae95
 Author: Toke Høiland-Jørgensen <toke@...hat.com>
 Date:   Wed Apr 9 12:41:37 2025 +0200
    page_pool: Track DMA-mapped pages and unmap them when destroying the pool

It turns out that the patch itself isn't wrong.

But it triggers the kernel bug, because it changes PP_MAGIC_MASK on
32-bit kernels as follows:

-#define PP_MAGIC_MASK ~0x3UL
+#define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)

The function page_pool_page_is_pp() needs to identify page pool pages
unambiguously (using PP_MAGIC_MASK). Since the patch reduced the bits checked
via PP_MAGIC_MASK from 0xfffffffc to 0xc000007c, the remaining bits are no
longer sufficient to identify such pages unambiguously.

As a result, page_pool_page_is_pp() sometimes wrongly reports ordinary pages
as page pool pages, and the kernel then triggers the BUG above because it
believes it has found a page pool leak.

IMHO this is a generic 32-bit kernel issue, not one that affects only parisc.

Do you see any options other than:
a) revert the patch (ee62ce7a1d90), or:
b) return false in page_pool_page_is_pp() when !defined(CONFIG_64BIT),
   which effectively disables the page pool page test on 32-bit
   machines

Helge
