Message-ID: <20251217051802.86144-1-joshua.hahnjy@gmail.com>
Date: Tue, 16 Dec 2025 21:18:02 -0800
From: Joshua Hahn <joshua.hahnjy@...il.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Daniel Palmer <daniel@...f.com>,
Matthew Wilcox <willy@...radead.org>,
Brendan Jackman <jackmanb@...gle.com>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...e.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Vlastimil Babka <vbabka@...e.cz>,
Zi Yan <ziy@...dia.com>,
linux-kernel@...r.kernel.org,
linux-mm@...ck.org,
kernel-team@...a.com
Subject: Re: [PATCH v4] mm/page_alloc: prevent reporting pcp->batch = 0
On Tue, 16 Dec 2025 10:22:05 -0800 Andrew Morton <akpm@...ux-foundation.org> wrote:
> On Tue, 16 Dec 2025 06:48:11 -0800 Joshua Hahn <joshua.hahnjy@...il.com> wrote:
>
> > zone_batchsize returns the value that should be used for
> > pcp->batch. For a zone with fewer than 4096 pages, or when PAGE_SIZE >
> > 1M, however, the math goes wrong.
> >
> > In the case above, we will get an intermediate value of 1, which is then
> > rounded down to the nearest power of two, and 1 is subtracted from it.
> > Since 1 is already a power of two, we get batch = 1 - 1 = 0:
> >
> > batch = rounddown_pow_of_two(batch + batch/2) - 1;
> >
> > A pcp->batch value of 0 is nonsensical for MMU systems. If this were
> > actually set, then functions like drain_zone_pages would become no-ops,
> > since they would free 0 pages at a time.
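To spell the arithmetic out, here is a minimal user-space sketch of what the old code did for a 3998-page zone. This is not kernel code; rounddown_pow_of_two() is reimplemented and the SZ_256K / PAGE_SIZE bound is dropped purely so the snippet stands alone:

#include <stdio.h>

/* Illustration-only stand-in for the kernel's rounddown_pow_of_two() */
static unsigned long rounddown_pow_of_two(unsigned long n)
{
	unsigned long p = 1;

	while (p * 2 <= n)
		p *= 2;
	return p;
}

int main(void)
{
	unsigned long managed_pages = 3998;	/* a small DMA zone */
	unsigned long batch;

	batch = managed_pages >> 12;		/* 3998 >> 12 == 0 */
	if (batch < 1)
		batch = 1;			/* old clamp: 0 -> 1 */

	/* 1 + 1/2 == 1, rounddown_pow_of_two(1) == 1, 1 - 1 == 0 */
	batch = rounddown_pow_of_two(batch + batch / 2) - 1;

	printf("LIFO batch:%lu\n", batch);	/* prints "LIFO batch:0" */
	return 0;
}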
> >
> > Of the two callers of zone_batchsize, the one that is actually used to
> > set pcp->batch works around this by setting pcp->batch to the maximum
> > of 1 and zone_batchsize. However, the other caller, zone_pcp_init,
> > incorrectly prints out the batch size of the zone to be 0.
> >
> > This is probably rare in a typical zone, but the DMA zone can often have
> > fewer than 4096 pages, which means it will print out "LIFO batch:0".
> >
> > Before: [ 0.001216] DMA zone: 3998 pages, LIFO batch:0
> > After: [ 0.001210] DMA zone: 3998 pages, LIFO batch:1
> >
> > With all of this said, NOMMU differs in two ways. Semantically, it
> > should report that pcp->batch is 0. At the same time, it can never
> > actually run with a pcp->batch of 0, since the pcp freeing functions
> > would deadlock. For this reason, zone_batchsize should still report 0
> > for NOMMU, but zone_set_pageset_high_and_batch should interpret it as 1,
> > meaning we cannot get rid of max(1, zone_batchsize()) in
> > zone_set_pageset_high_and_batch.
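To make the reporting-vs-programming split concrete, here is a standalone sketch (again not kernel source): the reported value can be 0 on !MMU, but the value actually programmed into pcp->batch mirrors the max(1, zone_batchsize()) clamp mentioned above.

#include <stdio.h>

#define NOMMU 1			/* pretend CONFIG_MMU is not set */

/* Mimics what zone_batchsize() reports */
static int batch_reported(void)
{
#if NOMMU
	return 0;		/* !MMU: batching is meaningless, report 0 */
#else
	return 63;		/* whatever the 2^n - 1 math yields with an MMU */
#endif
}

/* Mimics the max(1, zone_batchsize()) clamp in the caller */
static int batch_programmed(void)
{
	int b = batch_reported();

	return b < 1 ? 1 : b;	/* never program 0 into pcp->batch */
}

int main(void)
{
	printf("reported:%d programmed:%d\n",
	       batch_reported(), batch_programmed());
	return 0;
}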
> >
> > Suggested-by: Daniel Palmer <daniel@...f.com>
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@...il.com>
> > ---
> > Reviewers' note:
> >
> > This patch was originally part of the 6.19-rc1 PR, but Daniel Palmer
> > kindly reported that this patch causes an issue on NOMMU systems [1].
> > Thank you, Daniel! I wasn't sure how to credit here, since it was a
> > report on an unmerged commit, so I went with Suggested-by. If this is
> > problematic, please let me know and I will change the tag.
> >
> > [1] https://lore.kernel.org/all/CAFr9PX=_HaM3_xPtTiBn5Gw5-0xcRpawpJ02NStfdr0khF2k7g@mail.gmail.com/
> >
> > Reviewer's note (to Andrew):
> >
> > This replaces commit 2/2 of the series titled "mm/page_alloc: pcp->batch
> > cleanups" [2].
>
> That series is in mainline. 2783088ef24e ("mm/page_alloc: prevent
> reporting pcp->batch = 0").
Hello Andrew,
Sorry again; this mix-up was also avoidable. I'll be more careful in the future.
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -5888,8 +5888,8 @@ static int zone_batchsize(struct zone *zone)
> > * and zone lock contention.
> > */
> > batch = min(zone_managed_pages(zone) >> 12, SZ_256K / PAGE_SIZE);
> > - if (batch < 1)
> > - batch = 1;
> > + if (batch <= 1)
> > + return 1;
> >
> > /*
> > * Clamp the batch to a 2^n - 1 value. Having a power
> >
>
> So this doesn't work. Please send along a fix for current -linus and include
>
> Fixes: 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
>
> and Reported-by: Daniel and the appropriate Closes:
>
> Thanks.
Will do, thank you for your patience. I hope you have a great day!
Joshua