Message-ID: <42143500-c380-41fe-815c-696c17241506@roeck-us.net>
Date: Tue, 16 Dec 2025 13:47:03 -0800
From: Guenter Roeck <linux@...ck-us.net>
To: Joshua Hahn <joshua.hahnjy@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Dave Hansen <dave.hansen@...el.com>,
Brendan Jackman <jackmanb@...gle.com>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...e.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Vlastimil Babka <vbabka@...e.cz>, Zi Yan <ziy@...dia.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
kernel-team@...a.com, Geert Uytterhoeven <geert@...ux-m68k.org>,
linux-m68k@...ts.linux-m68k.org
Subject: Re: [PATCH v2 2/2] mm/page_alloc: Prevent reporting pcp->batch = 0
[mcf5208evb boot failure]
Hi,
On Thu, Oct 09, 2025 at 12:29:31PM -0700, Joshua Hahn wrote:
> zone_batchsize returns the appropriate value that should be used for
> pcp->batch. If it finds a zone with less than 4096 pages or PAGE_SIZE >
> 1M, however, it leads to some incorrect math.
>
> In the above case, we will get an intermediary value of 1, which is then
> rounded down to the nearest power of two, and 1 is subtracted from it.
> Since 1 is already a power of two, we will get batch = 1-1 = 0:
>
> batch = rounddown_pow_of_two(batch + batch/2) - 1;
>
> A pcp->batch value of 0 is nonsensical. If this were actually set, then
> functions like drain_zone_pages would become no-ops, since they could
> only free 0 pages at a time.
>
> Of the two callers of zone_batchsize, the one that is actually used to
> set pcp->batch works around this by setting pcp->batch to the maximum
> of 1 and zone_batchsize. However, the other caller, zone_pcp_init,
> incorrectly prints out the batch size of the zone to be 0.
>
> This is probably rare in a typical zone, but the DMA zone can often have
> less than 4096 pages, which means it will print out "LIFO batch:0".
>
> Before: [ 0.001216] DMA zone: 3998 pages, LIFO batch:0
> After: [ 0.001210] DMA zone: 3998 pages, LIFO batch:1
>
> Instead of dealing with the error handling and the mismatch between the
> reported and actual zone batchsize, just return 1 if the zone_batchsize
> is 1 page or less before the rounding.
>
> Signed-off-by: Joshua Hahn <joshua.hahnjy@...il.com>
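
For reference, the arithmetic described in the quoted commit message can be sketched in userspace C. This is only an illustration, not the kernel's zone_batchsize(): the helper names rounddown_pow2(), batch_before() and batch_after() are hypothetical, and the "intermediary value" is passed in directly instead of being derived from a zone.

```c
/*
 * Hypothetical userspace stand-in for the kernel's rounddown_pow_of_two().
 * Returns the largest power of two that is <= n (returns 1 for n == 0).
 */
unsigned long rounddown_pow2(unsigned long n)
{
	unsigned long p = 1;

	while (p <= n / 2)
		p *= 2;
	return p;
}

/* Rounding step as quoted above, with no special case for small values. */
int batch_before(int batch)
{
	return (int)rounddown_pow2((unsigned long)(batch + batch / 2)) - 1;
}

/*
 * Same step with the early return the commit message describes:
 * report 1 if the value is 1 page or less before the rounding.
 */
int batch_after(int batch)
{
	if (batch <= 1)
		return 1;
	return (int)rounddown_pow2((unsigned long)(batch + batch / 2)) - 1;
}
```

With an intermediary value of 1, batch_before(1) evaluates to 1 - 1 = 0 while batch_after(1) yields 1; for larger inputs such as 4 both return the same result.
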
With this patch in the tree, the qemu 'mcf5208evb' machine fails to boot
with memory errors such as:

S01syslogd: page allocation failure: order:7, mode:0xcc0(GFP_KERNEL), nodemask=(null)
CPU: 0 UID: 0 PID: 34 Comm: S01syslogd Not tainted 6.19.0-rc1 #1 NONE
Stack from 407d7ce0:
407d7ce0 403df960 403df960 00000000 00000001 00000007 40027c60 403df960
400c06be 00000cc0 00000001 407d7d7e 400bf614 407d7d34 403df3ba 407d7d14
407d7db8 400c0e5c 00000cc0 00000000 403df3ba 00000007 00000007 00000cc0
000d8000 00000018 0000006c 00000001 00000000 40fe6640 00000000 40fe81e4
403ffa40 4085eff4 00000000 00000400 00000000 001008c0 00000000 40854041
f4fe0000 00004041 f4fe0000 00000000 00010000 403ffa40 4085ed00 4085e800
Call Trace: [<40027c60>] dump_stack+0xc/0x10
[<400c06be>] warn_alloc+0xdc/0x1bc
[<400bf614>] get_page_from_freelist+0x0/0xfa6
[<400c0e5c>] __alloc_frozen_pages_noprof+0x6be/0x8be
[<400c1358>] get_free_pages_noprof+0x16/0x3e

Reverting this patch fixes the problem.
Bisect log is attached for reference.

Guenter
---
# bad: [416f99c3b16f582a3fc6d64a1f77f39d94b76de5] Merge tag 'driver-core-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
# good: [559e608c46553c107dbba19dae0854af7b219400] Merge tag 'ntfs3_for_6.19' of https://github.com/Paragon-Software-Group/linux-ntfs3
git bisect start '416f99c3b16f' '559e608c4655'
# good: [fa5ef105618ae9b5aaa51b3f09e41d88d4514207] Merge tag 'spi-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
git bisect good fa5ef105618ae9b5aaa51b3f09e41d88d4514207
# bad: [399ead3a6d76cbdd29a716660db5c84a314dab70] Merge tag 'uml-for-linux-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
git bisect bad 399ead3a6d76cbdd29a716660db5c84a314dab70
# good: [a3ebb59eee2e558e8f8f27fc3f75cd367f17cd8e] Merge tag 'vfio-v6.19-rc1' of https://github.com/awilliam/linux-vfio
git bisect good a3ebb59eee2e558e8f8f27fc3f75cd367f17cd8e
# bad: [faf3c923523e5c8fc3baaa413d62e913774ae52f] mm: fix vma_start_write_killable() signal handling
git bisect bad faf3c923523e5c8fc3baaa413d62e913774ae52f
# bad: [915a2453d824a9b6bf724e3f970d86ae1d092a61] mm/damon/tests/core-kunit: handle alloc failure on damon_test_set_attrs()
git bisect bad 915a2453d824a9b6bf724e3f970d86ae1d092a61
# bad: [c707a68f9468e4ef4a3546b636a9dd088fe7b7f1] mm: abstract io_remap_pfn_range() based on PFN
git bisect bad c707a68f9468e4ef4a3546b636a9dd088fe7b7f1
# bad: [ca30ac479e6cf7a210dcad32fa2ee99ca0357e91] mm/page_owner: simplify zone iteration logic in init_early_allocated_pages()
git bisect bad ca30ac479e6cf7a210dcad32fa2ee99ca0357e91
# good: [138336d674d2e51f1e5699d2a30af1e9aa1352b4] mm/zswap: remove unnecessary dlen writes for incompressible pages
git bisect good 138336d674d2e51f1e5699d2a30af1e9aa1352b4
# good: [0de9a442eeba4a6435af74120822b10b12ab8449] mm/page_owner: update Documentation with 'show_handles' and 'show_stacks_handles'
git bisect good 0de9a442eeba4a6435af74120822b10b12ab8449
# bad: [95b34d66480bbc9bc31e78c26b1d5be47358ffc0] mm: always call rmap_walk() on locked folios
git bisect bad 95b34d66480bbc9bc31e78c26b1d5be47358ffc0
# bad: [2783088ef24e32df9d70eb2a24f70de28b476a05] mm/page_alloc: prevent reporting pcp->batch = 0
git bisect bad 2783088ef24e32df9d70eb2a24f70de28b476a05
# good: [4dcf65bf5be22e32d389628b0e655731f97f525e] mm/page_alloc: clarify batch tuning in zone_batchsize
git bisect good 4dcf65bf5be22e32d389628b0e655731f97f525e
# first bad commit: [2783088ef24e32df9d70eb2a24f70de28b476a05] mm/page_alloc: prevent reporting pcp->batch = 0