Message-ID: <CALe3CaAehCC6WOpCAGtMX3qsTqMc8jh3kn1Fz_m7_7_M6SMgfQ@mail.gmail.com>
Date: Fri, 25 Oct 2024 16:19:31 +0800
From: Su Hua <suhua.tanke@...il.com>
To: Mike Rapoport <rppt@...nel.org>
Cc: Stephen Rothwell <sfr@...b.auug.org.au>, 
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, 
	Linux Next Mailing List <linux-next@...r.kernel.org>
Subject: Re: linux-next: boot failure after merge of the memblock tree

Thanks, everyone.

On Fri, Oct 25, 2024 at 14:57, Mike Rapoport <rppt@...nel.org> wrote:
>
> Hi Stephen,
>
> On Tue, Oct 22, 2024 at 05:39:21PM +1100, Stephen Rothwell wrote:
> > Hi all,
> >
> > After merging the memblock tree, today's linux-next build
> > (powerpc_pseries_le_defconfig) failed my qemu boot test like this:
> >
> > mem auto-init: stack:all(zero), heap alloc:off, heap free:off
> > BUG: Unable to handle kernel data access on read at 0x00001878
> > Faulting instruction address: 0xc0000000004f00e4
> > Oops: Kernel access of bad area, sig: 7 [#1]
> > LE PAGE_SIZE=4K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> > Modules linked in:
> > CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.12.0-rc4-06078-g367eaba2691a #1
> > Hardware name: IBM pSeries (emulated by qemu) POWER10 (architected) 0x801200 0xf000006 of:SLOF,HEAD pSeries
> > NIP:  c0000000004f00e4 LR: c000000000489df8 CTR: 0000000000000000
> > REGS: c0000000028cfae0 TRAP: 0300   Not tainted  (6.12.0-rc4-06078-g367eaba2691a)
> > MSR:  8000000002001033 <SF,VEC,ME,IR,DR,RI,LE>  CR: 84000240  XER: 00000000
> > CFAR: c0000000004f2c48 DAR: 0000000000001878 DSISR: 00080000 IRQMASK: 3
> > GPR00: c00000000204994c c0000000028cfd80 c0000000016a4300 c00c000000040000
> > GPR04: 0000000000000001 0000000000001000 0000000000000007 c000000002a11178
> > GPR08: 0000000000000000 0000000000001800 c00000007fffe720 0000000000002001
> > GPR12: 0000000000000000 c000000002a6a000 0000000000000000 00000000018855c0
> > GPR16: c000000002940270 c00c000000000000 0000000000040000 0000000000000000
> > GPR20: 0000000000000000 ffffffffffffffff 0000000000000001 ffffffffffffffff
> > GPR24: 00c0000000000000 0000000000000000 0000000000000000 0000000008000000
> > GPR28: 0000000000000000 0000000000002a6b 0000000000000000 0000000000001000
> > NIP [c0000000004f00e4] set_pfnblock_flags_mask+0x74/0x140
> > LR [c000000000489df8] reserve_bootmem_region+0x2a8/0x2c0
> > Call Trace:
> > [c0000000028cfd80] [c0000000028cfdd0] 0xc0000000028cfdd0 (unreliable)
> > [c0000000028cfe20] [c00000000204994c] memblock_free_all+0x144/0x2d0
> > [c0000000028cfea0] [c000000002016354] mem_init+0x5c/0x70
> > [c0000000028cfec0] [c00000000204547c] mm_core_init+0x158/0x1dc
> > [c0000000028cff30] [c000000002004350] start_kernel+0x608/0x944
> > [c0000000028cffe0] [c00000000000e99c] start_here_common+0x1c/0x20
> > Code: 4182000c 79082d28 7d4a4214 e9230000 3d020137 38e8ce78 79284620 792957a0 79081f24 79295d24 7d07402a 7d284a14 <e9090078> 7c254040 41800094 e9290088
> > ---[ end trace 0000000000000000 ]---
> >
> > Kernel panic - not syncing: Attempted to kill the idle task!
> >
> > Caused by commit
> >
> >   ad48825232a9 ("memblock: uniformly initialize all reserved pages to MIGRATE_MOVABLE")
> >
> > I bisected the failure to this commit and have reverted it for today.
>
> Apparently set_pfnblock_flags_mask() is unhappy when called for
> uninitialized struct page. With the patch below
>
> qemu-system-ppc64el -M pseries -cpu power10 -smp 16 -m 32G -vga none -nographic -kernel $KERNEL
>
> boots up to mounting root filesystem.
>
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 49dbd30e71ad..2395970314e7 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -723,10 +723,10 @@ static void __meminit init_reserved_page(unsigned long pfn, int nid)
>                         break;
>         }
>
> +       __init_single_page(pfn_to_page(pfn), pfn, zid, nid);
> +
>         if (pageblock_aligned(pfn))
>                 set_pageblock_migratetype(pfn_to_page(pfn), MIGRATE_MOVABLE);
> -
> -       __init_single_page(pfn_to_page(pfn), pfn, zid, nid);

Indeed. When NODE_NOT_IN_PAGE_FLAGS is defined there is no problem, which
is why my test environment did not reveal any issues. However, when
NODE_NOT_IN_PAGE_FLAGS is not defined, page_to_nid() has to read
page->flags to get the node ID, and page->flags is only valid after
__init_single_page() has initialized the page.
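
To illustrate the ordering dependency, here is a minimal userspace sketch,
not kernel code. It assumes a toy layout in which the node ID lives in the
top 4 bits of a 64-bit flags word; the toy_* names and the bit widths are
hypothetical and do not match the real page-flags layout. It only shows that
a node lookup done before initialization returns whatever bits happen to be
in memory.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: node ID kept in the top 4 bits of the flags word. */
#define TOY_NODES_SHIFT   4u
#define TOY_NODES_PGSHIFT (64u - TOY_NODES_SHIFT)
#define TOY_NODES_MASK    ((UINT64_C(1) << TOY_NODES_SHIFT) - 1)

/* Toy stand-in for struct page; only the flags word matters here. */
struct toy_page {
	uint64_t flags;
};

/* Toy analogue of __init_single_page(): encodes the node ID into flags. */
static void toy_init_single_page(struct toy_page *page, unsigned int nid)
{
	assert(nid <= TOY_NODES_MASK);
	page->flags = (uint64_t)nid << TOY_NODES_PGSHIFT;
}

/* Toy analogue of page_to_nid() when the node ID is stored in flags. */
static unsigned int toy_page_to_nid(const struct toy_page *page)
{
	return (unsigned int)((page->flags >> TOY_NODES_PGSHIFT) & TOY_NODES_MASK);
}
```

Calling toy_page_to_nid() on a page whose flags still hold stale bits yields
a node ID unrelated to the real one; calling it after toy_init_single_page()
returns the value that was stored. That is the same ordering the patch above
restores by moving __init_single_page() ahead of
set_pageblock_migratetype().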

>  }
>  #else
>  static inline void pgdat_set_deferred_range(pg_data_t *pgdat) {}
>
> > --
> > Cheers,
> > Stephen Rothwell
>
>
>
> --
> Sincerely yours,
> Mike.

Sincerely yours,
Su
