Message-ID: <b0635ad0-7ebf-4152-a69b-58e7e87d5085@roeck-us.net>
Date: Wed, 23 Jul 2025 20:55:14 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Baokun Li <libaokun1@...wei.com>
Cc: linux-ext4@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca,
jack@...e.cz, linux-kernel@...r.kernel.org, ojaswin@...ux.ibm.com,
julia.lawall@...ia.fr, yi.zhang@...wei.com, yangerkun@...wei.com,
libaokun@...weicloud.com
Subject: Re: [PATCH v3 15/17] ext4: convert free groups order lists to xarrays
Hi,
On Mon, Jul 14, 2025 at 09:03:25PM +0800, Baokun Li wrote:
> While traversing the list, holding a spin_lock prevents load_buddy, making
> direct use of ext4_try_lock_group impossible. This can lead to a bouncing
> scenario where spin_is_locked(grp_A) succeeds, but ext4_try_lock_group()
> fails, forcing the list traversal to repeatedly restart from grp_A.
>
This patch causes crashes for pretty much every architecture when
running unit tests as part of booting.
An example (from x86_64) and the bisect log are attached below.
Guenter
---
...
[ 9.353832] # Subtest: test_new_blocks_simple
[ 9.366711] BUG: kernel NULL pointer dereference, address: 0000000000000014
[ 9.366931] #PF: supervisor read access in kernel mode
[ 9.366993] #PF: error_code(0x0000) - not-present page
[ 9.367165] PGD 0 P4D 0
[ 9.367305] Oops: Oops: 0000 [#1] SMP PTI
[ 9.367686] CPU: 0 UID: 0 PID: 217 Comm: kunit_try_catch Tainted: G N 6.16.0-rc7-next-20250722 #1 PREEMPT(voluntary)
[ 9.367846] Tainted: [N]=TEST
[ 9.367891] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 9.368063] RIP: 0010:ext4_mb_release+0x26e/0x510
[ 9.368374] Code: 28 4a cb ff e8 03 5a cf ff 31 db 48 8d 3c 9b 48 83 c3 01 48 c1 e7 04 48 03 bd 60 05 00 00 e8 c9 a6 48 01 48 8b 85 68 03 00 00 <0f> b6 40 14 83 c0 02 39 d8 7f d6 48 8b bd 60 05 00 00 31 db e8 d9
[ 9.368581] RSP: 0000:ffffb33b8041fe40 EFLAGS: 00010286
[ 9.368659] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[ 9.368732] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff9a319e36
[ 9.368802] RBP: ffff8b89c3502400 R08: 0000000000000001 R09: 0000000000000000
[ 9.368872] R10: 0000000000000001 R11: 0000000000000120 R12: ffff8b89c2f49160
[ 9.368941] R13: ffff8b89c2f49158 R14: ffff8b89c2f24000 R15: ffff8b89c2f24000
[ 9.369042] FS: 0000000000000000(0000) GS:ffff8b8a3381a000(0000) knlGS:0000000000000000
[ 9.369127] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9.369194] CR2: 0000000000000014 CR3: 0000000009a9c000 CR4: 00000000003506f0
[ 9.369324] Call Trace:
[ 9.369440] <TASK>
[ 9.369637] mbt_kunit_exit+0x47/0xf0
[ 9.369745] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
[ 9.369813] kunit_try_run_case_cleanup+0x2f/0x40
[ 9.369865] kunit_generic_run_threadfn_adapter+0x1c/0x40
[ 9.369922] kthread+0x10b/0x230
[ 9.369965] ? __pfx_kthread+0x10/0x10
[ 9.370013] ret_from_fork+0x165/0x1b0
[ 9.370057] ? __pfx_kthread+0x10/0x10
[ 9.370099] ret_from_fork_asm+0x1a/0x30
[ 9.370188] </TASK>
[ 9.370250] Modules linked in:
[ 9.370428] CR2: 0000000000000014
[ 9.370657] ---[ end trace 0000000000000000 ]---
[ 9.370791] RIP: 0010:ext4_mb_release+0x26e/0x510
[ 9.370847] Code: 28 4a cb ff e8 03 5a cf ff 31 db 48 8d 3c 9b 48 83 c3 01 48 c1 e7 04 48 03 bd 60 05 00 00 e8 c9 a6 48 01 48 8b 85 68 03 00 00 <0f> b6 40 14 83 c0 02 39 d8 7f d6 48 8b bd 60 05 00 00 31 db e8 d9
[ 9.370996] RSP: 0000:ffffb33b8041fe40 EFLAGS: 00010286
[ 9.371050] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[ 9.371112] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff9a319e36
[ 9.371174] RBP: ffff8b89c3502400 R08: 0000000000000001 R09: 0000000000000000
[ 9.371235] R10: 0000000000000001 R11: 0000000000000120 R12: ffff8b89c2f49160
[ 9.371297] R13: ffff8b89c2f49158 R14: ffff8b89c2f24000 R15: ffff8b89c2f24000
[ 9.371358] FS: 0000000000000000(0000) GS:ffff8b8a3381a000(0000) knlGS:0000000000000000
[ 9.371428] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9.371484] CR2: 0000000000000014 CR3: 0000000009a9c000 CR4: 00000000003506f0
[ 9.371598] note: kunit_try_catch[217] exited with irqs disabled
[ 9.371861] # test_new_blocks_simple: try faulted: last line seen fs/ext4/mballoc-test.c:452
[ 9.372123] # test_new_blocks_simple: internal error occurred during test case cleanup: -4
[ 9.372440] not ok 1 block_bits=10 cluster_bits=3 blocks_per_group=8192 group_count=4 desc_size=64
[ 9.375702] BUG: kernel NULL pointer dereference, address: 0000000000000014
[ 9.375782] #PF: supervisor read access in kernel mode
[ 9.375832] #PF: error_code(0x0000) - not-present page
[ 9.375881] PGD 0 P4D 0
[ 9.375919] Oops: Oops: 0000 [#2] SMP PTI
[ 9.375966] CPU: 0 UID: 0 PID: 219 Comm: kunit_try_catch Tainted: G D N 6.16.0-rc7-next-20250722 #1 PREEMPT(voluntary)
[ 9.376085] Tainted: [D]=DIE, [N]=TEST
[ 9.376123] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 9.376220] RIP: 0010:ext4_mb_release+0x26e/0x510
[ 9.376275] Code: 28 4a cb ff e8 03 5a cf ff 31 db 48 8d 3c 9b 48 83 c3 01 48 c1 e7 04 48 03 bd 60 05 00 00 e8 c9 a6 48 01 48 8b 85 68 03 00 00 <0f> b6 40 14 83 c0 02 39 d8 7f d6 48 8b bd 60 05 00 00 31 db e8 d9
[ 9.376425] RSP: 0000:ffffb33b803f7e40 EFLAGS: 00010286
[ 9.376482] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[ 9.376546] RDX: 0000000002000008 RSI: ffffffff9a319e36 RDI: ffffffff9a319e36
[ 9.376608] RBP: ffff8b89c352a400 R08: 0000000000000000 R09: 0000000000000000
[ 9.376669] R10: 0000000000000000 R11: 0000000058d996d7 R12: ffff8b89c2f49cc0
[ 9.376730] R13: ffff8b89c2f49cb8 R14: ffff8b89c3524000 R15: ffff8b89c3524000
[ 9.376792] FS: 0000000000000000(0000) GS:ffff8b8a3381a000(0000) knlGS:0000000000000000
[ 9.376861] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9.376913] CR2: 0000000000000014 CR3: 0000000009a9c000 CR4: 00000000003506f0
[ 9.376975] Call Trace:
[ 9.377004] <TASK>
[ 9.377040] mbt_kunit_exit+0x47/0xf0
[ 9.377089] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
[ 9.377150] kunit_try_run_case_cleanup+0x2f/0x40
[ 9.377207] kunit_generic_run_threadfn_adapter+0x1c/0x40
[ 9.377266] kthread+0x10b/0x230
[ 9.377308] ? __pfx_kthread+0x10/0x10
[ 9.377353] ret_from_fork+0x165/0x1b0
[ 9.377397] ? __pfx_kthread+0x10/0x10
[ 9.377439] ret_from_fork_asm+0x1a/0x30
[ 9.377505] </TASK>
[ 9.377531] Modules linked in:
[ 9.377571] CR2: 0000000000000014
[ 9.377609] ---[ end trace 0000000000000000 ]---
---
Bisect log:
# bad: [a933d3dc1968fcfb0ab72879ec304b1971ed1b9a] Add linux-next specific files for 20250723
# good: [89be9a83ccf1f88522317ce02f854f30d6115c41] Linux 6.16-rc7
git bisect start 'HEAD' 'v6.16-rc7'
# bad: [a56f8f8967ad980d45049973561b89dcd9e37e5d] Merge branch 'main' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
git bisect bad a56f8f8967ad980d45049973561b89dcd9e37e5d
# bad: [f6a8dede4030970707e9bae5b3ae76f60df4b75a] Merge branch 'fs-next' of linux-next
git bisect bad f6a8dede4030970707e9bae5b3ae76f60df4b75a
# good: [b863560c5a26fbcf164f5759c98bb5e72e26848d] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git
git bisect good b863560c5a26fbcf164f5759c98bb5e72e26848d
# bad: [690056682cc4de56d8de794bc06a3c04bc7f624b] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs.git
git bisect bad 690056682cc4de56d8de794bc06a3c04bc7f624b
# good: [fea76c3eb7455d1e941fba6fdd89ab41ab7797c8] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git
git bisect good fea76c3eb7455d1e941fba6fdd89ab41ab7797c8
# bad: [714a183e8cf1cc1ddddb3318de1694a33f49c694] Merge branch 'dev' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git
git bisect bad 714a183e8cf1cc1ddddb3318de1694a33f49c694
# good: [5fb60c0365c4dad347e4958f78976cb733d903f2] f2fs: Pass a folio to __has_merged_page()
git bisect good 5fb60c0365c4dad347e4958f78976cb733d903f2
# bad: [a8a47fa84cc2168b2b3bd645c2c0918eed994fc0] ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr
git bisect bad a8a47fa84cc2168b2b3bd645c2c0918eed994fc0
# good: [a35454ecf8a320c49954fdcdae0e8d3323067632] ext4: use memcpy() instead of strcpy()
git bisect good a35454ecf8a320c49954fdcdae0e8d3323067632
# good: [3772fe7b4225f21a1bfe63e4a338702cc3c153de] ext4: convert sbi->s_mb_free_pending to atomic_t
git bisect good 3772fe7b4225f21a1bfe63e4a338702cc3c153de
# good: [12a5b877c314778ddf9a5c603eeb1803a514ab58] ext4: factor out ext4_mb_might_prefetch()
git bisect good 12a5b877c314778ddf9a5c603eeb1803a514ab58
# bad: [458bfb991155c2e8ba51861d1ef3c81c5a0846f9] ext4: convert free groups order lists to xarrays
git bisect bad 458bfb991155c2e8ba51861d1ef3c81c5a0846f9
# good: [6e0275f6e713f55dd3fc23be317ec11f8db1766d] ext4: factor out ext4_mb_scan_group()
git bisect good 6e0275f6e713f55dd3fc23be317ec11f8db1766d
# first bad commit: [458bfb991155c2e8ba51861d1ef3c81c5a0846f9] ext4: convert free groups order lists to xarrays
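
As a reading aid for the oops above, here is a minimal Python sketch that cross-checks the faulting instruction against the reported address. It assumes the usual x86 oops layout, where the byte at RIP is wrapped in <..> on the "Code:" line. The strings are copied from the first trace; the helper names (faulting_bytes, register) are hypothetical and are not part of any kernel tooling.

```python
import re

# "Code:" bytes and register values copied from the first oops in this log.
CODE = ("28 4a cb ff e8 03 5a cf ff 31 db 48 8d 3c 9b 48 83 c3 01 48 c1 e7 04 "
        "48 03 bd 60 05 00 00 e8 c9 a6 48 01 48 8b 85 68 03 00 00 <0f> b6 40 14 "
        "83 c0 02 39 d8 7f d6 48 8b bd 60 05 00 00 31 db e8 d9")
REGS = ("RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000\n"
        "CR2: 0000000000000014 CR3: 0000000009a9c000 CR4: 00000000003506f0\n")


def faulting_bytes(code_line, count=4):
    """Return `count` bytes starting at the byte marked with <..> (the byte at RIP)."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:start + count]]


def register(dump, name):
    """Extract a 64-bit register value (hex) from a register dump."""
    match = re.search(rf"\b{name}: ([0-9a-f]{{16}})", dump)
    return int(match.group(1), 16)


insn = faulting_bytes(CODE)
opcode, modrm, disp8 = insn[:2], insn[2], insn[3]

# 0f b6 is MOVZX r32, r/m8. In the ModRM byte, mod == 1 means
# "[base + 8-bit displacement]" and rm == 0 selects RAX as the base.
assert opcode == [0x0F, 0xB6]
mod, rm = modrm >> 6, modrm & 7
assert (mod, rm) == (1, 0)

fault_addr = register(REGS, "RAX") + disp8
cr2 = register(REGS, "CR2")
print(f"computed fault address {fault_addr:#x}, CR2 {cr2:#x}")
```

The marked bytes decode to a one-byte load from offset 0x14 of the pointer held in RAX. RAX is zero in the dump, so the computed address agrees with the reported NULL pointer dereference at 0000000000000014.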