[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251216142755.2857688-1-joshua.hahnjy@gmail.com>
Date: Tue, 16 Dec 2025 06:27:53 -0800
From: Joshua Hahn <joshua.hahnjy@...il.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Daniel Palmer <daniel@...f.com>,
Brendan Jackman <jackmanb@...gle.com>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...e.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Vlastimil Babka <vbabka@...e.cz>,
Zi Yan <ziy@...dia.com>,
linux-kernel@...r.kernel.org,
linux-mm@...ck.org,
kernel-team@...a.com
Subject: [PATCH v3] mm/page_alloc: prevent reporting pcp->batch = 0
zone_batchsize returns the appropriate value that should be used for
pcp->batch. If it finds a zone with less than 4096 pages or PAGE_SIZE >
1M, however, it leads to some incorrect math.
In the case above, we will get an intermediary value of 1, which is then
rounded down to the nearest power of two, and 1 is subtracted from it.
Since 1 is already a power of two, we will get batch = 1-1 = 0:
batch = rounddown_pow_of_two(batch + batch/2) - 1;
A pcp->batch value of 0 is nonsensical, for MMU systems. If this were
actually set, then functions like drain_zone_pages would become no-ops,
since they would free 0 pages at a time.
Of the two callers of zone_batchsize, the one that is actually used to
set pcp->batch works around this by setting pcp->batch to the maximum
of 1 and zone_batchsize. However, the other caller, zone_pcp_init,
incorrectly prints out the batch size of the zone to be 0.
This is probably rare in a typical zone, but the DMA zone can often have
less than 4096 pages, which means it will print out "LIFO batch:0".
Before: [ 0.001216] DMA zone: 3998 pages, LIFO batch:0
After: [ 0.001210] DMA zone: 3998 pages, LIFO batch:1
With all of this said, NOMMU differs in two ways. Semantically, it
should report that pcp->batch is 0. At the same time, it can never
really have a pcp->batch size of 0 since it will reach a deadlock in
pcp freeing functions. For this reason, zone_batchsize should still
report 0 for NOMMU, but zone_set_pageset_high_and_batch should still
interpret it as 1, meaning we cannot get rid of max(1, zone_batchsize())
in zone_set_pageset_high_and_batch.
Suggested-by: Daniel Palmer <daniel@...f.com>
Signed-off-by: Joshua Hahn <joshua.hahnjy@...il.com>
---
Reviewers' note:
This patch was originally a part of the 6.19-rc1 pr, but Daniel Palmer
kindly reported that this patch causes an issue on NOMMU systems [1].
Thank you, Daniel! I wasn't sure how to credit here since it was a
report on an unmerged commit so I went with suggested-by. If this is
problematic please let me know and I will change the tag.
[1] https://lore.kernel.org/all/CAFr9PX=_HaM3_xPtTiBn5Gw5-0xcRpawpJ02NStfdr0khF2k7g@mail.gmail.com/
Reviewer's note (to Andrew):
This replaces commit 2/2 of the series titled "mm/page_alloc: pcp->batch
cleanups" [2]. I thought it might be best to just send out this patch and not
the first since the other one is already in the pull request (and seemingly
without any issues). If you would like me to resubmit the series, please
let me know and I will do that!
[2] https://lore.kernel.org/all/20251009192933.3756712-3-joshua.hahnjy@gmail.com/
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f928d37eeb6a..95172f4610ff 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5888,7 +5888,7 @@ static int zone_batchsize(struct zone *zone)
* and zone lock contention.
*/
batch = min(zone_managed_pages(zone) >> 12, SZ_256K / PAGE_SIZE);
- if (batch < 1)
+ if (batch <= 1)
batch = 1;
/*
base-commit: e0669bdbba170f6c1a7bf5763f72df3f58a4945c
--
2.47.3
Powered by blists - more mailing lists