Message-ID: <20251217051654.80355-1-joshua.hahnjy@gmail.com>
Date: Tue, 16 Dec 2025 21:16:54 -0800
From: Joshua Hahn <joshua.hahnjy@...il.com>
To: Guenter Roeck <linux@...ck-us.net>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	Dave Hansen <dave.hansen@...el.com>,
	Brendan Jackman <jackmanb@...gle.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Michal Hocko <mhocko@...e.com>,
	Suren Baghdasaryan <surenb@...gle.com>,
	Vlastimil Babka <vbabka@...e.cz>,
	Zi Yan <ziy@...dia.com>,
	linux-kernel@...r.kernel.org,
	linux-mm@...ck.org,
	kernel-team@...a.com,
	Geert Uytterhoeven <geert@...ux-m68k.org>,
	linux-m68k@...ts.linux-m68k.org
Subject: Re: [PATCH v2 2/2] mm/page_alloc: Prevent reporting pcp->batch = 0 [mcf5208evb boot failure]

On Tue, 16 Dec 2025 13:47:03 -0800 Guenter Roeck <linux@...ck-us.net> wrote:

> Hi,
> 
> On Thu, Oct 09, 2025 at 12:29:31PM -0700, Joshua Hahn wrote:
> > zone_batchsize returns the appropriate value that should be used for
> > pcp->batch. If it finds a zone with less than 4096 pages or PAGE_SIZE >
> > 1M, however, it leads to some incorrect math.
> > 
> > In the above case, we will get an intermediary value of 1, which is then
> > rounded down to the nearest power of two, and 1 is subtracted from it.
> > Since 1 is already a power of two, we will get batch = 1-1 = 0:
> > 
> > 	batch = rounddown_pow_of_two(batch + batch/2) - 1;
> > 
> > A pcp->batch value of 0 is nonsensical. If this were actually set, then
> > functions like drain_zone_pages would become no-ops, since they could
> > only free 0 pages at a time.
> > 
> > Of the two callers of zone_batchsize, the one that is actually used to
> > set pcp->batch works around this by setting pcp->batch to the maximum
> > of 1 and zone_batchsize. However, the other caller, zone_pcp_init,
> > incorrectly prints out the batch size of the zone to be 0.
> > 
> > This is probably rare in a typical zone, but the DMA zone can often have
> > less than 4096 pages, which means it will print out "LIFO batch:0".
> > 
> > Before: [    0.001216]   DMA zone: 3998 pages, LIFO batch:0
> > After:  [    0.001210]   DMA zone: 3998 pages, LIFO batch:1
> > 
> > Instead of dealing with the error handling and the mismatch between the
> > reported and actual zone batchsize, just return 1 if the zone_batchsize
> > is 1 page or less before the rounding.
> > 
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@...il.com>
> 
> With this patch in the tree, the qemu 'mcf5208evb' machine fails to boot
> with memory errors such as
> 
> S01syslogd: page allocation failure: order:7, mode:0xcc0(GFP_KERNEL), nodemask=(null)
> CPU: 0 UID: 0 PID: 34 Comm: S01syslogd Not tainted 6.19.0-rc1 #1 NONE
> Stack from 407d7ce0:
>         407d7ce0 403df960 403df960 00000000 00000001 00000007 40027c60 403df960
>         400c06be 00000cc0 00000001 407d7d7e 400bf614 407d7d34 403df3ba 407d7d14
>         407d7db8 400c0e5c 00000cc0 00000000 403df3ba 00000007 00000007 00000cc0
>         000d8000 00000018 0000006c 00000001 00000000 40fe6640 00000000 40fe81e4
>         403ffa40 4085eff4 00000000 00000400 00000000 001008c0 00000000 40854041
>         f4fe0000 00004041 f4fe0000 00000000 00010000 403ffa40 4085ed00 4085e800
> Call Trace: [<40027c60>] dump_stack+0xc/0x10
>  [<400c06be>] warn_alloc+0xdc/0x1bc
>  [<400bf614>] get_page_from_freelist+0x0/0xfa6
>  [<400c0e5c>] __alloc_frozen_pages_noprof+0x6be/0x8be
>  [<400c1358>] get_free_pages_noprof+0x16/0x3e
> 
> Reverting this patch fixes the problem.

Hi Guenter,

Thank you for the report. Daniel Palmer has already identified a problem
with this patch on NOMMU systems, and I think this is the same one:
mcf5208evb also appears to be NOMMU (arch/m68k/Kconfig.cpu shows that
config M520x depends on !MMU).

Andrew let me know that the patch has already been merged into mainline,
so I'll be sending a fix shortly. Sorry about the problem, and thank you
again for reporting it. I hope you have a great day!
Joshua
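
For readers following the arithmetic in the quoted patch description, here
is a minimal userspace sketch of the rounding step and of the early return
that the description proposes. It is not the kernel code: the names
rounddown_pow_of_two_model, batch_before_fix and batch_after_fix are
hypothetical, and the helper only models what rounddown_pow_of_two is
described as doing (round down to the nearest power of two). The input is
the "intermediary value" mentioned in the description, and how that value
is derived from the zone size is deliberately left out.

```c
#include <assert.h>

/*
 * Hypothetical userspace stand-in for rounddown_pow_of_two():
 * returns the largest power of two that is <= n. Requires n > 0.
 */
unsigned long rounddown_pow_of_two_model(unsigned long n)
{
	unsigned long p = 1;

	assert(n > 0);
	while (p <= n / 2)
		p *= 2;
	return p;
}

/*
 * Rounding step as quoted in the patch description:
 *	batch = rounddown_pow_of_two(batch + batch/2) - 1;
 * Assumes the intermediary value is at least 1.
 */
long batch_before_fix(long batch)
{
	return (long)rounddown_pow_of_two_model((unsigned long)(batch + batch / 2)) - 1;
}

/*
 * Same step, with the early return the description proposes:
 * report 1 when the intermediary value is 1 page or less.
 */
long batch_after_fix(long batch)
{
	if (batch <= 1)
		return 1;
	return (long)rounddown_pow_of_two_model((unsigned long)(batch + batch / 2)) - 1;
}
```

With an intermediary value of 1, the first function yields 1 - 1 = 0, which
matches the "LIFO batch:0" line in the description, while the second yields
1. For larger inputs the two agree; for example, an input of 4 gives
rounddown(6) - 1 = 3 in both. This sketch only covers the arithmetic and
says nothing about the NOMMU boot failure discussed above.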
