linux-kernel - Re: [GIT PULL] MM updates for 6.19-rc1

Message-ID: <20251211225947.822866-1-joshua.hahnjy@gmail.com>
Date: Thu, 11 Dec 2025 14:59:46 -0800
From: Joshua Hahn <joshua.hahnjy@...il.com>
To: Daniel Palmer <daniel@...f.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	linux-mm@...ck.org,
	linux-kernel@...r.kernel.org,
	mm-commits@...r.kernel.org
Subject: Re: [GIT PULL] MM updates for 6.19-rc1

On Thu, 11 Dec 2025 20:12:18 +0900 Daniel Palmer <daniel@...f.com> wrote:

> Hi Andrew,
> 
> On Thu, 4 Dec 2025 at 14:29, Andrew Morton <akpm@...ux-foundation.org> wrote:
> >       mm/page_alloc: prevent reporting pcp->batch = 0
> 
> I think, maybe, the following part of this patch broke nommu.
> 
> -       new_batch = max(1, zone_batchsize(zone));
> +       new_batch = zone_batchsize(zone);
> 
> Before this change on nommu zone_batchsize() returns 0 but the max()
> changes it to 1. Now it'll stay as 0 and anywhere that depends on it
> not being 0 won't work?

Hi Daniel,

Thank you for taking a look at this and finding that this was the source of
the deadlock. I took a look, and it's definitely an issue. The problem is that
the patch gets rid of the max(1, zone_batchsize()) and handles the MMU case
by ensuring zone_batchsize() never returns a value less than 1, but the
NOMMU case always returns 0.
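
To make the difference concrete, here is a small userspace model (not kernel
code) of the two ways new_batch is computed. The names has_mmu, computed,
model_zone_batchsize, new_batch_before and new_batch_after are made up for
illustration, and the model reuses the post-patch return values for both
callers, which does not matter for NOMMU since it returns 0 either way:

```c
/* Userspace model only. has_mmu stands in for CONFIG_MMU and computed
 * stands in for whatever value the MMU branch would derive; both are
 * assumptions made for this sketch. */
static int max_int(int a, int b)
{
	return a > b ? a : b;
}

static int model_zone_batchsize(int has_mmu, int computed)
{
	if (has_mmu)
		return max_int(1, computed);	/* MMU: never below 1 */
	return 0;				/* NOMMU: always 0 */
}

/* Before the patch: the caller clamps the result to at least 1. */
static int new_batch_before(int has_mmu, int computed)
{
	return max_int(1, model_zone_batchsize(has_mmu, computed));
}

/* After the patch: the caller uses the return value as-is. */
static int new_batch_after(int has_mmu, int computed)
{
	return model_zone_batchsize(has_mmu, computed);
}
```

In this model new_batch_before(0, 0) gives 1 while new_batch_after(0, 0)
gives 0, which is the NOMMU regression described above.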

I think your solution below works. I've also come up with a simpler workaround
which doesn't change drain_pages_zone:

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d0f026ec10b6..9d638697cec8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5919,7 +5919,7 @@ static int zone_batchsize(struct zone *zone)
         * recycled, this leads to the once large chunks of space being
         * fragmented and becoming unavailable for high-order allocations.
         */
-       return 0;
+       return 1;
 #endif
 }

Would this be enough? Then the callsites would not have to handle zero values
on NOMMU machines either. But this has the opposite problem from the one
I was initially trying to fix: NOMMU machines will now report a batchsize
of 1 in zone_pcp_init and print it to dmesg, which may be confusing for
NOMMU users who expect there to be no batchsize. So it would also make
sense to me to drop my original patch completely. I'm not a NOMMU user, so
I am hoping to receive some feedback from folks who are, and who can chime
in on which approach is better.

> I'm seeing a deadlock on nommu:
> 
> https://lore.kernel.org/lkml/20251211102607.2538595-1-daniel@thingy.jp/

I would also like to take this opportunity to ask any NOMMU experts out there
about the apparent disagreement between the comment in zone_batchsize under the
NOMMU case, which suggests that NOMMU is harmed by batched freeing:

	/* The deferral and batching of frees should be suppressed under NOMMU
	 * conditions.

The code returns 0 here, which makes sense, only for the caller to
artificially raise it to 1 via the max() later on, so batching still happens
anyway via the << CONFIG_PCP_BATCH_SCALE_MAX scaling.
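
As a rough sketch of why that matters, the userspace model below drains a
page count in chunks of at most batch << scale. It is not the kernel's
drain loop: drain_passes, BATCH_SCALE_MAX and the value 5 are assumptions
for illustration, and max_passes exists only so that the model terminates
when no progress is made:

```c
#include <assert.h>

/* Hypothetical stand-in for CONFIG_PCP_BATCH_SCALE_MAX; the value 5 is
 * an assumption for this sketch. */
#define BATCH_SCALE_MAX 5

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Model of a batched drain: each pass removes at most
 * batch << BATCH_SCALE_MAX pages. Returns the number of passes needed
 * to reach zero, or -1 if count is still nonzero after max_passes.
 */
static int drain_passes(int count, int batch, int max_passes)
{
	int passes = 0;

	assert(max_passes > 0);

	while (count > 0) {
		int to_drain = min_int(count, batch << BATCH_SCALE_MAX);

		if (passes++ >= max_passes)
			return -1;	/* no progress: model gives up */
		count -= to_drain;
	}
	return passes;
}
```

With batch = 1 the model still frees up to 32 pages per pass, so 100 pages
take 4 passes (drain_passes(100, 1, 1000) returns 4). With batch = 0 every
pass frees nothing and the model gives up (drain_passes(100, 0, 1000)
returns -1), which is the kind of stall a zero batch could cause in a loop
of this shape.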

Thank you again, Daniel, for helping root cause this. Hopefully this change
resolves the deadlock you mentioned! Have a great day :-)
Joshua