Message-ID: <b0635ad0-7ebf-4152-a69b-58e7e87d5085@roeck-us.net>
Date: Wed, 23 Jul 2025 20:55:14 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Baokun Li <libaokun1@...wei.com>
Cc: linux-ext4@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca,
	jack@...e.cz, linux-kernel@...r.kernel.org, ojaswin@...ux.ibm.com,
	julia.lawall@...ia.fr, yi.zhang@...wei.com, yangerkun@...wei.com,
	libaokun@...weicloud.com
Subject: Re: [PATCH v3 15/17] ext4: convert free groups order lists to xarrays

Hi,

On Mon, Jul 14, 2025 at 09:03:25PM +0800, Baokun Li wrote:
> While traversing the list, holding a spin_lock prevents load_buddy, making
> direct use of ext4_try_lock_group impossible. This can lead to a bouncing
> scenario where spin_is_locked(grp_A) succeeds, but ext4_try_lock_group()
> fails, forcing the list traversal to repeatedly restart from grp_A.
> 

This patch causes crashes for pretty much every architecture when
running unit tests as part of booting.

Example (from x86_64) as well as bisect log attached below.

Guenter

---
...
[    9.353832]         # Subtest: test_new_blocks_simple
[    9.366711] BUG: kernel NULL pointer dereference, address: 0000000000000014
[    9.366931] #PF: supervisor read access in kernel mode
[    9.366993] #PF: error_code(0x0000) - not-present page
[    9.367165] PGD 0 P4D 0
[    9.367305] Oops: Oops: 0000 [#1] SMP PTI
[    9.367686] CPU: 0 UID: 0 PID: 217 Comm: kunit_try_catch Tainted: G                 N  6.16.0-rc7-next-20250722 #1 PREEMPT(voluntary)
[    9.367846] Tainted: [N]=TEST
[    9.367891] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[    9.368063] RIP: 0010:ext4_mb_release+0x26e/0x510
[    9.368374] Code: 28 4a cb ff e8 03 5a cf ff 31 db 48 8d 3c 9b 48 83 c3 01 48 c1 e7 04 48 03 bd 60 05 00 00 e8 c9 a6 48 01 48 8b 85 68 03 00 00 <0f> b6 40 14 83 c0 02 39 d8 7f d6 48 8b bd 60 05 00 00 31 db e8 d9
[    9.368581] RSP: 0000:ffffb33b8041fe40 EFLAGS: 00010286
[    9.368659] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[    9.368732] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff9a319e36
[    9.368802] RBP: ffff8b89c3502400 R08: 0000000000000001 R09: 0000000000000000
[    9.368872] R10: 0000000000000001 R11: 0000000000000120 R12: ffff8b89c2f49160
[    9.368941] R13: ffff8b89c2f49158 R14: ffff8b89c2f24000 R15: ffff8b89c2f24000
[    9.369042] FS:  0000000000000000(0000) GS:ffff8b8a3381a000(0000) knlGS:0000000000000000
[    9.369127] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    9.369194] CR2: 0000000000000014 CR3: 0000000009a9c000 CR4: 00000000003506f0
[    9.369324] Call Trace:
[    9.369440]  <TASK>
[    9.369637]  mbt_kunit_exit+0x47/0xf0
[    9.369745]  ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
[    9.369813]  kunit_try_run_case_cleanup+0x2f/0x40
[    9.369865]  kunit_generic_run_threadfn_adapter+0x1c/0x40
[    9.369922]  kthread+0x10b/0x230
[    9.369965]  ? __pfx_kthread+0x10/0x10
[    9.370013]  ret_from_fork+0x165/0x1b0
[    9.370057]  ? __pfx_kthread+0x10/0x10
[    9.370099]  ret_from_fork_asm+0x1a/0x30
[    9.370188]  </TASK>
[    9.370250] Modules linked in:
[    9.370428] CR2: 0000000000000014
[    9.370657] ---[ end trace 0000000000000000 ]---
[    9.370791] RIP: 0010:ext4_mb_release+0x26e/0x510
[    9.370847] Code: 28 4a cb ff e8 03 5a cf ff 31 db 48 8d 3c 9b 48 83 c3 01 48 c1 e7 04 48 03 bd 60 05 00 00 e8 c9 a6 48 01 48 8b 85 68 03 00 00 <0f> b6 40 14 83 c0 02 39 d8 7f d6 48 8b bd 60 05 00 00 31 db e8 d9
[    9.370996] RSP: 0000:ffffb33b8041fe40 EFLAGS: 00010286
[    9.371050] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[    9.371112] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff9a319e36
[    9.371174] RBP: ffff8b89c3502400 R08: 0000000000000001 R09: 0000000000000000
[    9.371235] R10: 0000000000000001 R11: 0000000000000120 R12: ffff8b89c2f49160
[    9.371297] R13: ffff8b89c2f49158 R14: ffff8b89c2f24000 R15: ffff8b89c2f24000
[    9.371358] FS:  0000000000000000(0000) GS:ffff8b8a3381a000(0000) knlGS:0000000000000000
[    9.371428] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    9.371484] CR2: 0000000000000014 CR3: 0000000009a9c000 CR4: 00000000003506f0
[    9.371598] note: kunit_try_catch[217] exited with irqs disabled
[    9.371861]     # test_new_blocks_simple: try faulted: last line seen fs/ext4/mballoc-test.c:452
[    9.372123]     # test_new_blocks_simple: internal error occurred during test case cleanup: -4
[    9.372440]         not ok 1 block_bits=10 cluster_bits=3 blocks_per_group=8192 group_count=4 desc_size=64
[    9.375702] BUG: kernel NULL pointer dereference, address: 0000000000000014
[    9.375782] #PF: supervisor read access in kernel mode
[    9.375832] #PF: error_code(0x0000) - not-present page
[    9.375881] PGD 0 P4D 0 
[    9.375919] Oops: Oops: 0000 [#2] SMP PTI
[    9.375966] CPU: 0 UID: 0 PID: 219 Comm: kunit_try_catch Tainted: G      D          N  6.16.0-rc7-next-20250722 #1 PREEMPT(voluntary) 
[    9.376085] Tainted: [D]=DIE, [N]=TEST
[    9.376123] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[    9.376220] RIP: 0010:ext4_mb_release+0x26e/0x510
[    9.376275] Code: 28 4a cb ff e8 03 5a cf ff 31 db 48 8d 3c 9b 48 83 c3 01 48 c1 e7 04 48 03 bd 60 05 00 00 e8 c9 a6 48 01 48 8b 85 68 03 00 00 <0f> b6 40 14 83 c0 02 39 d8 7f d6 48 8b bd 60 05 00 00 31 db e8 d9
[    9.376425] RSP: 0000:ffffb33b803f7e40 EFLAGS: 00010286
[    9.376482] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[    9.376546] RDX: 0000000002000008 RSI: ffffffff9a319e36 RDI: ffffffff9a319e36
[    9.376608] RBP: ffff8b89c352a400 R08: 0000000000000000 R09: 0000000000000000
[    9.376669] R10: 0000000000000000 R11: 0000000058d996d7 R12: ffff8b89c2f49cc0
[    9.376730] R13: ffff8b89c2f49cb8 R14: ffff8b89c3524000 R15: ffff8b89c3524000
[    9.376792] FS:  0000000000000000(0000) GS:ffff8b8a3381a000(0000) knlGS:0000000000000000
[    9.376861] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    9.376913] CR2: 0000000000000014 CR3: 0000000009a9c000 CR4: 00000000003506f0
[    9.376975] Call Trace:
[    9.377004]  <TASK>
[    9.377040]  mbt_kunit_exit+0x47/0xf0
[    9.377089]  ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
[    9.377150]  kunit_try_run_case_cleanup+0x2f/0x40
[    9.377207]  kunit_generic_run_threadfn_adapter+0x1c/0x40
[    9.377266]  kthread+0x10b/0x230
[    9.377308]  ? __pfx_kthread+0x10/0x10
[    9.377353]  ret_from_fork+0x165/0x1b0
[    9.377397]  ? __pfx_kthread+0x10/0x10
[    9.377439]  ret_from_fork_asm+0x1a/0x30
[    9.377505]  </TASK>
[    9.377531] Modules linked in:
[    9.377571] CR2: 0000000000000014
[    9.377609] ---[ end trace 0000000000000000 ]---

---
Bisect log:

# bad: [a933d3dc1968fcfb0ab72879ec304b1971ed1b9a] Add linux-next specific files for 20250723
# good: [89be9a83ccf1f88522317ce02f854f30d6115c41] Linux 6.16-rc7
git bisect start 'HEAD' 'v6.16-rc7'
# bad: [a56f8f8967ad980d45049973561b89dcd9e37e5d] Merge branch 'main' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
git bisect bad a56f8f8967ad980d45049973561b89dcd9e37e5d
# bad: [f6a8dede4030970707e9bae5b3ae76f60df4b75a] Merge branch 'fs-next' of linux-next
git bisect bad f6a8dede4030970707e9bae5b3ae76f60df4b75a
# good: [b863560c5a26fbcf164f5759c98bb5e72e26848d] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git
git bisect good b863560c5a26fbcf164f5759c98bb5e72e26848d
# bad: [690056682cc4de56d8de794bc06a3c04bc7f624b] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs.git
git bisect bad 690056682cc4de56d8de794bc06a3c04bc7f624b
# good: [fea76c3eb7455d1e941fba6fdd89ab41ab7797c8] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git
git bisect good fea76c3eb7455d1e941fba6fdd89ab41ab7797c8
# bad: [714a183e8cf1cc1ddddb3318de1694a33f49c694] Merge branch 'dev' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git
git bisect bad 714a183e8cf1cc1ddddb3318de1694a33f49c694
# good: [5fb60c0365c4dad347e4958f78976cb733d903f2] f2fs: Pass a folio to __has_merged_page()
git bisect good 5fb60c0365c4dad347e4958f78976cb733d903f2
# bad: [a8a47fa84cc2168b2b3bd645c2c0918eed994fc0] ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr
git bisect bad a8a47fa84cc2168b2b3bd645c2c0918eed994fc0
# good: [a35454ecf8a320c49954fdcdae0e8d3323067632] ext4: use memcpy() instead of strcpy()
git bisect good a35454ecf8a320c49954fdcdae0e8d3323067632
# good: [3772fe7b4225f21a1bfe63e4a338702cc3c153de] ext4: convert sbi->s_mb_free_pending to atomic_t
git bisect good 3772fe7b4225f21a1bfe63e4a338702cc3c153de
# good: [12a5b877c314778ddf9a5c603eeb1803a514ab58] ext4: factor out ext4_mb_might_prefetch()
git bisect good 12a5b877c314778ddf9a5c603eeb1803a514ab58
# bad: [458bfb991155c2e8ba51861d1ef3c81c5a0846f9] ext4: convert free groups order lists to xarrays
git bisect bad 458bfb991155c2e8ba51861d1ef3c81c5a0846f9
# good: [6e0275f6e713f55dd3fc23be317ec11f8db1766d] ext4: factor out ext4_mb_scan_group()
git bisect good 6e0275f6e713f55dd3fc23be317ec11f8db1766d
# first bad commit: [458bfb991155c2e8ba51861d1ef3c81c5a0846f9] ext4: convert free groups order lists to xarrays
