Message-ID: <2689df25d95e7c4fab781be2b3a4ac7ff9b50132.camel@physik.fu-berlin.de>
Date: Mon, 04 Aug 2025 09:48:50 +0200
From: John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>
To: Anthony Yznaga <anthony.yznaga@...cle.com>, "Matthew Wilcox (Oracle)"
	 <willy@...radead.org>, linux-arch@...r.kernel.org
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org, "David S. Miller"	
 <davem@...emloft.net>, sparclinux@...r.kernel.org, Andreas Larsson	
 <andreas@...sler.com>, Rod Schnell <rods@...radio.com>, Sam James
 <sam@...too.org>
Subject: Re: [PATCH v4 25/36] sparc64: Implement the new page table range API

Hi,

On Mon, 2025-08-04 at 08:58 +0200, John Paul Adrian Glaubitz wrote:
> On Mon, 2025-08-04 at 07:36 +0200, John Paul Adrian Glaubitz wrote:
> > On Mon, 2025-08-04 at 07:12 +0200, John Paul Adrian Glaubitz wrote:
> > > On Sun, 2025-08-03 at 12:08 -0700, Anthony Yznaga wrote:
> > > > There was a follow-on fix that addressed a bug with this patch:
> > > > 
> > > > f4b4f3ec1a31 sparc64: add missing initialization of folio in tlb_batch_add()
> > > 
> > > Indeed I just tried v6.6 which has this patch and added your sun4u fix and it
> > > seems to be stable. I was sure I saw problems even with v6.16 though.
> > > 
> > > Let me run more tests.
> > 
> > I'm seeing another crash on v6.16 on sun4u even with your patch applied:
> > 
> > [  456.443492] kernel BUG at fs/ext4/inode.c:1174!
> > [  456.503059]               \|/ ____ \|/
> > [  456.503059]               "@'/ .. \`@"
> > [  456.503059]               /_| \__/ |_\
> > [  456.503059]                  \__U_/
> > [  456.696513] apt-get(1217): Kernel bad sw trap 5 [#1]
> > [  456.761698] CPU: 0 UID: 0 PID: 1217 Comm: apt-get Not tainted 6.16.0+ #24 VOLUNTARY 
> > [  456.863502] TSTATE: 0000000011001601 TPC: 0000000010309250 TNPC: 0000000010309254 Y: 00000000    Not tainted
> > [  456.992850] TPC: <ext4_block_write_begin+0x450/0x540 [ext4]>
> > [  457.067500] g0: 0000000000000000 g1: 0000000000000001 g2: 0000000000000000 g3: 0000000000000000
> > [  457.181869] g4: fff00000141d5c80 g5: 0000000000000008 g6: fff000000be24000 g7: 0000000000000001
> > [  457.296245] o0: 00000000103944b0 o1: 0000000000000496 o2: ffffffffffffffbf o3: 0000000000101cca
> > [  457.410618] o4: 0000000000000000 o5: 0000000000000000 sp: fff000000be26fd1 ret_pc: 0000000010309248
> > [  457.529571] RPC: <ext4_block_write_begin+0x448/0x540 [ext4]>
> > [  457.604020] l0: fff000003def26e0 l1: 0000000000113cca l2: fff000003def2578 l3: 0000000000000002
> > [  457.718394] l4: 0000000000000000 l5: 0000000000080000 l6: 0000000000012000 l7: 0000000000000001
> > [  457.832770] i0: 0000000000000000 i1: 000c00000026b500 i2: 0000000000001000 i3: 0000000000082000
> > [  457.947146] i4: 00000000103034a0 i5: 0000000000000000 i6: fff000000be270c1 i7: 000000001030c8dc
> > [  458.061528] I7: <ext4_da_write_begin+0x1bc/0x340 [ext4]>
> > [  458.131389] Call Trace:
> > [  458.163408] [<000000001030c8dc>] ext4_da_write_begin+0x1bc/0x340 [ext4]
> > [  458.250447] [<0000000000674230>] generic_perform_write+0x90/0x240
> > [  458.330606] [<00000000102f50b4>] ext4_buffered_write_iter+0x54/0x120 [ext4]
> > [  458.422214] [<00000000102f5624>] ext4_file_write_iter+0x3e4/0x780 [ext4]
> > [  458.510388] [<0000000000749cc4>] vfs_write+0x2c4/0x3e0
> > [  458.577867] [<0000000000749f4c>] ksys_write+0x4c/0xe0
> > [  458.644203] [<0000000000749ff4>] sys_write+0x14/0x40
> > [  458.709397] [<0000000000406174>] linux_sparc_syscall+0x34/0x44
> > [  458.786048] Disabling lock debugging due to kernel taint
> > [  458.855904] Caller[000000001030c8dc]: ext4_da_write_begin+0x1bc/0x340 [ext4]
> > [  458.948653] Caller[0000000000674230]: generic_perform_write+0x90/0x240
> > [  459.034430] Caller[00000000102f50b4]: ext4_buffered_write_iter+0x54/0x120 [ext4]
> > [  459.131761] Caller[00000000102f5624]: ext4_file_write_iter+0x3e4/0x780 [ext4]
> > [  459.225648] Caller[0000000000749cc4]: vfs_write+0x2c4/0x3e0
> > [  459.298846] Caller[0000000000749f4c]: ksys_write+0x4c/0xe0
> > [  459.370900] Caller[0000000000749ff4]: sys_write+0x14/0x40
> > [  459.441810] Caller[0000000000406174]: linux_sparc_syscall+0x34/0x44
> > [  459.524168] Caller[0000000000000000]: 0x0
> > [  459.576772] Instruction DUMP:
> > [  459.576776]  11040e51 
> > [  459.615662]  7c04b816 
> > [  459.646541]  901220b0 
> > [  459.677418] <91d02005>
> > [  459.708302]  9735a000 
> > [  459.739181]  95352000 
> > [  459.770076]  d25fa7cf 
> > [  459.800945]  7fffe818 
> > [  459.831825]  90100019 
> > [  459.862706] 
> > [  459.941500] systemd[1]: Failed to open /dev/pts device, ignoring: Inappropriate ioctl for device
> > [  460.063831] systemd[1]: rsyslog.service: Main process exited, code=killed, status=6/ABRT
> > [  460.170962] systemd[1]: rsyslog.service: Failed with result 'signal'.
> > [  460.267153] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 1.
> > [  460.388605] systemd[1]: rsyslog.service: Scheduled restart job, restart counter is at 1.
> > [  460.517346] systemd[1]: Starting rsyslog.service - System Logging Service...
> > [  460.618299] systemd[1]: Starting systemd-journald.service - Journal Service...
> > [  460.895645] systemd-journald[1237]: Collecting audit messages is disabled.
> > [  461.048068] systemd[1]: Failed to open /dev/pts device, ignoring: Inappropriate ioctl for device
> > [  461.202783] systemd-journald[1237]: File /var/log/journal/9ac90e257b3e423284cfc21a00cbeeb8/system.journal corrupted or uncleanly shut down, renaming and replacing.
> > [  461.456867] systemd[1]: Started rsyslog.service - System Logging Service.
> > [  461.616651] systemd-journald[1237]: Time jumped backwards, rotating.
> > [  461.773305] systemd-journald[1237]: Failed to read journal file /var/log/journal/9ac90e257b3e423284cfc21a00cbeeb8/user-1002.journal for rotation, trying to move it out of the way: Device or resource busy
> > [  462.065725] systemd[1]: Started systemd-journald.service - Journal Service.
> > [  462.159895] systemd-journald[1237]: Time jumped backwards, rotating.
> > [  519.719624] kernel BUG at fs/ext4/inode.c:1174!
> > [  519.779143]               \|/ ____ \|/
> > [  519.779143]               "@'/ .. \`@"
> > [  519.779143]               /_| \__/ |_\
> > [  519.779143]                  \__U_/
> > [  519.972586] apt(1249): Kernel bad sw trap 5 [#2]
> > [  520.033239] CPU: 0 UID: 0 PID: 1249 Comm: apt Tainted: G      D             6.16.0+ #24 VOLUNTARY 
> > [  520.151048] Tainted: [D]=DIE
> > [  520.188797] TSTATE: 0000000011001603 TPC: 0000000010309250 TNPC: 0000000010309254 Y: 00000000    Tainted: G      D            
> > [  520.338725] TPC: <ext4_block_write_begin+0x450/0x540 [ext4]>
> > [  520.413282] g0: 0000000000000000 g1: 0000000000000001 g2: 0000000000000000 g3: 0000000000000000
> > [  520.527655] g4: fff00000141d40c0 g5: 000000000000000b g6: fff000000a818000 g7: 0000000000000001
> > [  520.642031] o0: 00000000103944b0 o1: 0000000000000496 o2: fffffffffffffcc0 o3: 0000000000101cca
> > [  520.756408] o4: 0000000000000004 o5: 0000000000000000 sp: fff000000a81afd1 ret_pc: 0000000010309248
> > [  520.875350] RPC: <ext4_block_write_begin+0x448/0x540 [ext4]>
> > [  520.949799] l0: fff000023439af00 l1: 0000000000113cca l2: fff000023439ad98 l3: 0000000000000002
> > [  521.064174] l4: 0000000000000000 l5: 0000000000080000 l6: 0000000000012000 l7: 0000000000000001
> > [  521.178547] i0: 0000000000000000 i1: 000c000000164a00 i2: 0000000000001fc0 i3: 0000000000680000
> > [  521.292923] i4: 00000000103034a0 i5: 0000000000000000 i6: fff000000a81b0c1 i7: 000000001030c8dc
> > [  521.407297] I7: <ext4_da_write_begin+0x1bc/0x340 [ext4]>
> > [  521.477195] Call Trace:
> > [  521.509295] [<000000001030c8dc>] ext4_da_write_begin+0x1bc/0x340 [ext4]
> > [  521.596330] [<0000000000674230>] generic_perform_write+0x90/0x240
> > [  521.676495] [<00000000102f50b4>] ext4_buffered_write_iter+0x54/0x120 [ext4]
> > [  521.768196] [<00000000102f5624>] ext4_file_write_iter+0x3e4/0x780 [ext4]
> > [  521.856381] [<0000000000749cc4>] vfs_write+0x2c4/0x3e0
> > [  521.923957] [<0000000000749f4c>] ksys_write+0x4c/0xe0
> > [  521.990294] [<0000000000749ff4>] sys_write+0x14/0x40
> > [  522.055486] [<0000000000406174>] linux_sparc_syscall+0x34/0x44
> > [  522.132122] Caller[000000001030c8dc]: ext4_da_write_begin+0x1bc/0x340 [ext4]
> > [  522.224873] Caller[0000000000674230]: generic_perform_write+0x90/0x240
> > [  522.310649] Caller[00000000102f50b4]: ext4_buffered_write_iter+0x54/0x120 [ext4]
> > [  522.407974] Caller[00000000102f5624]: ext4_file_write_iter+0x3e4/0x780 [ext4]
> > [  522.501864] Caller[0000000000749cc4]: vfs_write+0x2c4/0x3e0
> > [  522.575062] Caller[0000000000749f4c]: ksys_write+0x4c/0xe0
> > [  522.647118] Caller[0000000000749ff4]: sys_write+0x14/0x40
> > [  522.718031] Caller[0000000000406174]: linux_sparc_syscall+0x34/0x44
> > [  522.800380] Caller[0000000000000000]: 0x0
> > [  522.852991] Instruction DUMP:
> > [  522.852994]  11040e51 
> > [  522.891878]  7c04b816 
> > [  522.922760]  901220b0 
> > [  522.953638] <91d02005>
> > [  522.984521]  9735a000 
> > [  523.015401]  95352000 
> > [  523.046284]  d25fa7cf 
> > [  523.077163]  7fffe818 
> > [  523.108109]  90100019 
> > [  523.139044]
> > 
> > I'll try to bisect this one later this week.
> 
> OK, so v6.8 is fine while v6.9 crashes:
> 
> [   39.788224] Unable to handle kernel NULL pointer dereference
> [   39.862657] tsk->{mm,active_mm}->context = 000000000000004b
> [   39.935941] tsk->{mm,active_mm}->pgd = fff000000aa98000
> [   40.004566]               \|/ ____ \|/
> [   40.004566]               "@'/ .. \`@"
> [   40.004566]               /_| \__/ |_\
> [   40.004566]                  \__U_/
> [   40.197871] (udev-worker)(88): Oops [#1]
> [   40.249329] CPU: 0 PID: 88 Comm: (udev-worker) Tainted: P           O       6.9.0+ #28
> [   40.353415] TSTATE: 0000004411001605 TPC: 0000000000df092c TNPC: 0000000000df0930 Y: 00000000    Tainted: P           O      
> [   40.502105] TPC: <strlen+0x60/0xd4>
> [   40.547844] g0: fff000000a3171a1 g1: 0000000000000000 g2: 0000000000000000 g3: 0000000000000001
> [   40.662224] g4: fff000000aa4dac0 g5: 0000000010000233 g6: fff000000a314000 g7: 0000000000000000
> [   40.776599] o0: 0000000000000010 o1: 0000000000000010 o2: 0000000001010101 o3: 0000000080808080
> [   40.890974] o4: 0000000001010000 o5: 0000000000000000 sp: fff000000a317201 ret_pc: 00000000004d4b08
> [   41.009924] RPC: <module_patient_check_exists.constprop.0+0x48/0x1e0>
> [   41.094557] l0: fff0000100032f40 l1: 0000000000000000 l2: 0000000000000000 l3: 0000000000000000
> [   41.208936] l4: 0000000000000000 l5: 0000000000000000 l6: 0000000000000000 l7: 0000000000000000
> [   41.323311] i0: 00000001000256d8 i1: 0000000001143000 i2: 0000000001143300 i3: 000000000000000b
> [   41.437686] i4: 0000000000000010 i5: fffffffffffffff8 i6: fff000000a3172e1 i7: 00000000004d63f0
> [   41.552062] I7: <load_module+0x550/0x1f00>
> [   41.605811] Call Trace:
> [   41.637838] [<00000000004d63f0>] load_module+0x550/0x1f00
> [   41.708752] [<00000000004d7fac>] init_module_from_file+0x6c/0xa0
> [   41.787670] [<00000000004d81c8>] sys_finit_module+0x188/0x280
> [   41.863158] [<0000000000406174>] linux_sparc_syscall+0x34/0x44
> [   41.939790] Caller[00000000004d63f0]: load_module+0x550/0x1f00
> [   42.016423] Caller[00000000004d7fac]: init_module_from_file+0x6c/0xa0
> [   42.101059] Caller[00000000004d81c8]: sys_finit_module+0x188/0x280
> [   42.182266] Caller[0000000000406174]: linux_sparc_syscall+0x34/0x44
> [   42.264614] Caller[fff000010480e2fc]: 0xfff000010480e2fc
> [   42.334384] Instruction DUMP:
> [   42.334387]  96132080 
> [   42.373269]  19004040 
> [   42.404151]  94132101 
> [   42.435030] <da020000>
> [   42.465914]  9823400a 
> [   42.496793]  808b000b 
> [   42.527674]  024ffffd 
> [   42.558556]  90022004 
> [   42.589437]  8f336018 
> [   42.620318]
> 
> So, the regression was introduced with v6.9. Will bisect this later this week.

Hmm, I just ran into another oops on v6.8. The machine itself didn't go down, though:

[  489.263666] Unable to handle kernel paging request at virtual address 000c000002400000
[  489.367912] tsk->{mm,active_mm}->context = 00000000000013b2
[  489.441150] tsk->{mm,active_mm}->pgd = fff000000af04000
[  489.509872]               \|/ ____ \|/
                             "@'/ .. \`@"
                             /_| \__/ |_\
                                \__U_/
[  489.703156] sshd-session(3671): Oops [#1]
[  489.755758] CPU: 0 PID: 3671 Comm: sshd-session Not tainted 6.8.0+ #27
[  489.841544] TSTATE: 0000000811001600 TPC: 000000000065d620 TNPC: 000000000065d624 Y: 00000000    Not tainted
[  489.970796] TPC: <unmap_page_range+0x620/0xc60>
[  490.030362] g0: fff000000a939360 g1: 0000000000008800 g2: ffffffffffffffff g3: ffffffffffffffff
[  490.144748] g4: fff0000000d4a100 g5: 0000000002ad4a68 g6: fff000000a6dc000 g7: 0000010000000000
[  490.259118] o0: 000c0000024005a0 o1: fff00001018a4000 o2: 0000000100028290 o3: 0000000100028290
[  490.373493] o4: fff0000001afe71c o5: 0000000001099c00 sp: fff000000a6deeb1 ret_pc: 000000000065d53c
[  490.492447] RPC: <unmap_page_range+0x53c/0xc60>
[  490.551915] l0: fff00001018f4000 l1: 0000000100028290 l2: fff000000a6df968 l3: fff00001018a4000
[  490.666292] l4: fff00000070a25a0 l5: fff000000a6dfaa8 l6: 0000000000000001 l7: 00000000011605a8
[  490.780668] i0: fff000000a9b0900 i1: fff00001018a6000 i2: fff0000000f99018 i3: fff0000004308290
[  490.895045] i4: fff00001018f4000 i5: 000c0000024005a0 i6: fff000000a6deff1 i7: 000000000065dcd8
[  491.009418] I7: <unmap_single_vma.constprop.0+0x78/0xe0>
[  491.079183] Call Trace:
[  491.111206] [<000000000065dcd8>] unmap_single_vma.constprop.0+0x78/0xe0
[  491.198136] [<000000000065dd9c>] unmap_vmas+0x5c/0x1a0
[  491.265615] [<000000000066a2a4>] exit_mmap+0xc4/0x440
[  491.331950] [<0000000000463d44>] __mmput+0x44/0x140
[  491.396003] [<0000000000463e74>] mmput+0x34/0x60
[  491.456618] [<000000000046a444>] do_exit+0x284/0xaa0
[  491.521816] [<000000000046ae24>] do_group_exit+0x24/0xa0
[  491.591584] [<000000000046aebc>] sys_exit_group+0x1c/0x40
[  491.662496] [<0000000000406174>] linux_sparc_syscall+0x34/0x44
[  491.739127] Disabling lock debugging due to kernel taint
[  491.808896] Caller[000000000065dcd8]: unmap_single_vma.constprop.0+0x78/0xe0
[  491.901543] Caller[000000000065dd9c]: unmap_vmas+0x5c/0x1a0
[  491.974740] Caller[000000000066a2a4]: exit_mmap+0xc4/0x440
[  492.046795] Caller[0000000000463d44]: __mmput+0x44/0x140
[  492.116564] Caller[0000000000463e74]: mmput+0x34/0x60
[  492.182901] Caller[000000000046a444]: do_exit+0x284/0xaa0
[  492.253815] Caller[000000000046ae24]: do_group_exit+0x24/0xa0
[  492.329301] Caller[000000000046aebc]: sys_exit_group+0x1c/0x40
[  492.405935] Caller[0000000000406174]: linux_sparc_syscall+0x34/0x44
[  492.488282] Caller[fff0000102ad4a74]: 0xfff0000102ad4a74
[  492.558054] Instruction DUMP:
[  492.558057]  c6756010 
[  492.596937]  02ff7fd9 
[  492.627825]  c2356018 
[  492.658702] <c25f6008>
[  492.689581]  8610001d 
[  492.720460]  84086001 
[  492.751343]  82007fff 
[  492.782223]  87789401 
[  492.813105]  c258e018 

[  492.894311] Fixing recursive fault but reboot is needed!
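
As a reading aid, the bracketed words in the three Instruction DUMPs above can
be decoded by hand. The following is a minimal Python sketch, assuming the
SPARC V9 instruction field layout; decode, REGS and LOADS are hypothetical
helper names, and only the trap and the two load forms that appear in these
dumps are handled.

```python
# Decode the bracketed (faulting) word from a sparc64 "Instruction DUMP".
# Assumption: fields follow the SPARC V9 instruction formats. Only the
# cases seen in the dumps above are handled: "trap always" with an
# immediate trap number, and the integer loads lduw and ldx.

# r0-r7 are globals, r8-r15 outs, r16-r23 locals, r24-r31 ins.
REGS = [f"%{bank}{n}" for bank in "goli" for n in range(8)]

# op3 values of the two load instructions that occur in the dumps.
LOADS = {0x00: "lduw", 0x0B: "ldx"}


def decode(word: int) -> str:
    op = (word >> 30) & 0x3      # major opcode
    op3 = (word >> 19) & 0x3F    # minor opcode
    rs1 = (word >> 14) & 0x1F    # first source register
    imm = (word >> 13) & 0x1     # 1 = immediate operand, 0 = register operand

    # Tcc: op == 2, op3 == 0x3a; cond == 8 means "always".
    if op == 2 and op3 == 0x3A:
        cond = (word >> 25) & 0xF
        if cond == 8 and imm and rs1 == 0:
            return f"ta {word & 0x7F}"
        return "tcc (other form)"

    # Loads: op == 3, destination register in bits 29..25.
    if op == 3 and op3 in LOADS:
        rd = (word >> 25) & 0x1F
        if imm:
            simm13 = word & 0x1FFF
            if simm13 & 0x1000:          # sign-extend the 13-bit immediate
                simm13 -= 0x2000
            addr = f"{REGS[rs1]} + {simm13}"
        else:
            addr = f"{REGS[rs1]} + {REGS[word & 0x1F]}"
        return f"{LOADS[op3]} [{addr}], {REGS[rd]}"

    return "unhandled"


if __name__ == "__main__":
    for word in (0x91D02005, 0xDA020000, 0xC25F6008):
        print(f"{word:08x}  {decode(word)}")
```

Under these assumptions the three marked words decode to "ta 5",
"lduw [%o0 + %g0], %o5" and "ldx [%i5 + 8], %g1". That is consistent with the
logs: the ext4 BUG is reported as "Kernel bad sw trap 5", the strlen oops
loads through o0 = 0x10, and the unmap_page_range oops loads from i5 + 8 with
i5 = 000c0000024005a0, which lies in the page reported as the faulting
address.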

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913
