Message-ID: <2bcb018c8b237f7ab2356f4459e14ae81a6fec8b.camel@physik.fu-berlin.de>
Date: Mon, 11 Aug 2025 00:20:18 +0200
From: John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>
To: Anthony Yznaga <anthony.yznaga@...cle.com>, sparclinux@...r.kernel.org,
davem@...emloft.net, andreas@...sler.com
Cc: linux-kernel@...r.kernel.org, agordeev@...ux.ibm.com, will@...nel.org,
ryan.roberts@....com, david@...hat.com, osalvador@...e.de
Subject: Re: [PATCH] sparc64: fix hugetlb for sun4u
Hi,
On Sun, 2025-08-10 at 11:52 +0200, John Paul Adrian Glaubitz wrote:
> On Sat, 2025-08-09 at 08:42 +0200, John Paul Adrian Glaubitz wrote:
> > Let me know if you have more suggestions to test. I can also provide you with full
> > access to this Netra 240 if you send me your public SSH key in a private mail.
>
> I have narrowed it down to a regression between v6.3 and v6.4 now.
>
> The bug can be reproduced with the sparc64_defconfig on a Sun Netra 240 by setting
> CONFIG_TRANSPARENT_HUGEPAGE=y and CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y. When testing
> on a modern systemd-based distribution, it's also necessary to enable cgroup support
> and support for Sun partition tables (CONFIG_SUN_PARTITION=y).
>
> Then it should be a matter of bisecting the commits between v6.3 and v6.4.
>
> I will do that within the next few days as I'm currently a bit busy with other stuff.
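For reference, the options named in the quoted message could be collected in a
config fragment along these lines. This is only a sketch: CONFIG_CGROUPS is an
assumption about which symbol "CGroup support" refers to, and a systemd-based
userland may want additional cgroup controller options that are not listed in
this thread.

```
# Changes on top of sparc64_defconfig, as described in the quoted message
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
CONFIG_SUN_PARTITION=y
# Assumption: base symbol for cgroup support; controllers not listed here
CONFIG_CGROUPS=y
```

One way to use such a fragment is to append it to the .config produced by
"make sparc64_defconfig" and then run "make olddefconfig" so that Kconfig
resolves any dependencies.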
OK, it turns out it's reproducible on older kernels as well (but not as old as 4.19).
It's just much harder to trigger there. I found a reproducer, though, and will try to
find the problematic commit next. The trace below is from Debian's 6.3.11-1 kernel:
[50686.808389] BUG: Bad page map in process sshd-session pte:00000002 pmd:01448000
[50686.905701] addr:00000100000a0000 vm_flags:00000075 anon_vma:0000000000000000 mapping:fff000003c8ca4f8 index:50
[50687.038425] file:sshd-session fault:filemap_fault mmap:ext4_file_mmap [ext4] read_folio:ext4_read_folio [ext4]
[50687.170246] CPU: 0 PID: 37883 Comm: sshd-session Not tainted 6.3.0-2-sparc64 #1 Debian 6.3.11-1
[50687.285751] Call Trace:
[50687.317771] [<0000000000d660b0>] dump_stack+0x8/0x18
[50687.382976] [<000000000064fd1c>] print_bad_pte+0x15c/0x200
[50687.455024] [<0000000000650f84>] unmap_page_range+0x3e4/0xbe0
[50687.530513] [<0000000000651cd8>] unmap_vmas+0xf8/0x1a0
[50687.597993] [<000000000065e674>] exit_mmap+0xb4/0x360
[50687.664331] [<00000000004647dc>] __mmput+0x3c/0x120
[50687.728380] [<00000000004648f4>] mmput+0x34/0x60
[50687.788999] [<000000000046b510>] do_exit+0x250/0xa00
[50687.854194] [<000000000046bea4>] do_group_exit+0x24/0xa0
[50687.923962] [<000000000046bf3c>] sys_exit_group+0x1c/0x40
[50687.994875] [<0000000000406174>] linux_sparc_syscall+0x34/0x44
[50688.071518] Disabling lock debugging due to kernel taint
[50689.484196] Unable to handle kernel paging request at virtual address 000c000002400000
[50689.588368] tsk->{mm,active_mm}->context = 00000000001815a6
[50689.661677] tsk->{mm,active_mm}->pgd = fff000000ae60000
[50689.730374]               \|/ ____ \|/
                             "@'/ .. \`@"
                             /_| \__/ |_\
                                \__U_/
[50689.923679] sshd-session(37883): Oops [#1]
[50689.977420] CPU: 0 PID: 37883 Comm: sshd-session Tainted: G B 6.3.0-2-sparc64 #1 Debian 6.3.11-1
[50690.112384] TSTATE: 0000008811001607 TPC: 00000000006510cc TNPC: 00000000006510d0 Y: 00000000 Tainted: G B
[50690.261089] TPC: <unmap_page_range+0x52c/0xbe0>
[50690.320650] g0: 00000000000004a8 g1: 000c000000000000 g2: 0000000000008800 g3: ffffffffffffffff
[50690.435029] g4: fff0000001ef1280 g5: 0000000031200000 g6: fff0000001f04000 g7: ffffffffffffffff
[50690.549403] o0: 000c000002400a20 o1: 00000100000a4000 o2: 0000000100048290 o3: 0000000000000000
[50690.663779] o4: 0000000000000001 o5: 000000000000000d sp: fff0000001f06f61 ret_pc: 0000010000000000
[50690.782728] RPC: <0x10000000000>
[50690.825039] l0: 0000000100048290 l1: 000c000002400a20 l2: 00000100000a6000 l3: fff0000000950000
[50690.939419] l4: 00000100000fc000 l5: fff000000196dc20 l6: fff0000001f07938 l7: 00000000010f6fd0
[50691.053798] i0: fff0000001f07aa8 i1: 0000000000002000 i2: 00000100000a4000 i3: fff0000008311b00
[50691.168170] i4: 0000000000100000 i5: fff0000001448290 i6: fff0000001f07081 i7: 0000000000651cd8
[50691.282546] I7: <unmap_vmas+0xf8/0x1a0>
[50691.332867] Call Trace:
[50691.364891] [<0000000000651cd8>] unmap_vmas+0xf8/0x1a0
[50691.432371] [<000000000065e674>] exit_mmap+0xb4/0x360
[50691.498708] [<00000000004647dc>] __mmput+0x3c/0x120
[50691.562759] [<00000000004648f4>] mmput+0x34/0x60
[50691.623376] [<000000000046b510>] do_exit+0x250/0xa00
[50691.688573] [<000000000046bea4>] do_group_exit+0x24/0xa0
[50691.758340] [<000000000046bf3c>] sys_exit_group+0x1c/0x40
[50691.829256] [<0000000000406174>] linux_sparc_syscall+0x34/0x44
[50691.905886] Caller[0000000000651cd8]: unmap_vmas+0xf8/0x1a0
[50691.979085] Caller[000000000065e674]: exit_mmap+0xb4/0x360
[50692.051141] Caller[00000000004647dc]: __mmput+0x3c/0x120
[50692.120911] Caller[00000000004648f4]: mmput+0x34/0x60
[50692.187246] Caller[000000000046b510]: do_exit+0x250/0xa00
[50692.258160] Caller[000000000046bea4]: do_group_exit+0x24/0xa0
[50692.333645] Caller[000000000046bf3c]: sys_exit_group+0x1c/0x40
[50692.410280] Caller[0000000000406174]: linux_sparc_syscall+0x34/0x44
[50692.492629] Caller[fff0000102ad4a74]: 0xfff0000102ad4a74
[50692.562397] Instruction DUMP:
[50692.562399] ce762010
[50692.601281] 02f47fa8
[50692.632163] c4362018
[50692.663044] <c45c6008>
[50692.693926] 86100011
[50692.724808] 8e08a001
[50692.755689] 8400bfff
[50692.786569] 8779d402
[50692.817451] c458e018
[50692.898656] Fixing recursive fault but reboot is needed!
Adrian
--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-    GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913