Message-ID: <2d59bf3d-212c-418f-97ac-2157ab1c2628@huawei.com>
Date: Sat, 26 Jul 2025 08:50:10 +0800
From: Baokun Li <libaokun1@...wei.com>
To: Zhang Yi <yi.zhang@...weicloud.com>, Theodore Ts'o <tytso@....edu>
CC: Guenter Roeck <linux@...ck-us.net>, <linux-ext4@...r.kernel.org>,
<adilger.kernel@...ger.ca>, <jack@...e.cz>, <linux-kernel@...r.kernel.org>,
<ojaswin@...ux.ibm.com>, <julia.lawall@...ia.fr>, <yangerkun@...wei.com>,
<libaokun@...weicloud.com>
Subject: Re: [PATCH v3 15/17] ext4: convert free groups order lists to xarrays
On 7/25/2025 10:28 AM, Zhang Yi wrote:
> On 2025/7/24 22:54, Theodore Ts'o wrote:
>> On Thu, Jul 24, 2025 at 07:14:58PM +0800, Zhang Yi wrote:
>>> I'm sorry for this regression, we didn't run these tests.
>> No worries, I didn't run them either.
>>
>>> Could you please try the following diff? I have tested it on my
>>> machine, and the issue does not recur. If everything looks fine, I
>>> will send out the official patch.
>> This patch fixes the test bug which was causing the failure of
>> test_new_blocks_simple.
>>
> The official patch to fix test_new_blocks_simple for the next
> branch:
>
> https://lore.kernel.org/linux-ext4/20250725021550.3177573-1-yi.zhang@huaweicloud.com/
>
>> However, there is still a test failure of test_mb_mark_used in the patch
>> series starting with bbe11dd13a3f ("ext4: fix largest free orders
>> lists corruption on mb_optimize_scan switch"). The test failure is
>> fixed by 458bfb991155 ("ext4: convert free groups order lists to
>> xarrays"). The reason why this is especially problematic is that the
>> commit which introduced the problem is marked as "cc: stable", which
>> means it will get backported to LTS kernels, thus introducing a
>> potential bug.
>>
> Indeed!
>
>> One of the advantages of unit tests is that they are lightweight
>> enough that it is tractable to run them against every commit in the
>> patch series. So we should strive to add more unit tests, since it
>> makes it easier to detect regressions.
>>
>> Anyway, here's the stack trace starting with "ext4: fix largest free
>> orders lists corruption on mb_optimize_scan switch". Could you
>> investigate this failure? Many thanks!!
>>
> Sure! I've sent out the fix that applies to the kernel that has only
> merged bbe11dd13a3f ("ext4: fix largest free orders lists corruption
> on mb_optimize_scan switch"), but not merged 458bfb991155 ("ext4:
> convert free groups order lists to xarrays"). Please give it a try.
>
> https://lore.kernel.org/linux-ext4/20250725021654.3188798-1-yi.zhang@huaweicloud.com/
>
Sorry for the late reply; I haven't had time to look into this during the week.
I really appreciate Yi taking the time to help address these issues.
I'm also very sorry for introducing a regression in the ext4 kunit tests.
Thanks,
Baokun
>
>> [09:35:46] ==================== test_mb_mark_used ====================
>> [09:35:46] [ERROR] Test: test_mb_mark_used: missing subtest result line!
>> [09:35:46]
>> [09:35:46] Pid: 35, comm: kunit_try_catch Tainted: G W N 6.16.0-rc4-00031-gbbe11dd13a3f-dirty
>> [09:35:46] RIP: 0033:mb_set_largest_free_order+0x5c/0xc0
>> [09:35:46] RSP: 00000000a0883d98 EFLAGS: 00010206
>> [09:35:46] RAX: 0000000060aeaa28 RBX: 0000000060a2d400 RCX: 0000000000000008
>> [09:35:46] RDX: 0000000060aea9c0 RSI: 0000000000000000 RDI: 0000000060864000
>> [09:35:46] RBP: 0000000060aea9c0 R08: 0000000000000000 R09: 0000000060a2d400
>> [09:35:46] R10: 0000000000000400 R11: 0000000060a9cc00 R12: 0000000000000006
>> [09:35:46] R13: 0000000000000400 R14: 0000000000000305 R15: 0000000000000000
>> [09:35:46] Kernel panic - not syncing: Segfault with no mm
>> [09:35:46] CPU: 0 UID: 0 PID: 35 Comm: kunit_try_catch Tainted: G W N 6.16.0-rc4-00031-gbbe11dd13a3f-dirty #36 NONE
>> [09:35:46] Tainted: [W]=WARN, [N]=TEST
>> [09:35:46] Stack:
>> [09:35:46] 60210c60 00000200 60a9e400 00000400
>> [09:35:46] 40060300280 60864000 60a9cc00 60a2d400
>> [09:35:46] 00000400 60aea9c0 60a9cc00 60aea9c0
>> [09:35:46] Call Trace:
>> [09:35:46] [<60210c60>] ? ext4_mb_generate_buddy+0x1f0/0x230
>> [09:35:46] [<60215c3b>] ? test_mb_mark_used+0x28b/0x4e0
>> [09:35:46] [<601df5bc>] ? ext4_get_group_desc+0xbc/0x150
>> [09:35:46] [<600bf1c0>] ? ktime_get_ts64+0x0/0x190
>> [09:35:46] [<60086370>] ? to_kthread+0x0/0x40
>> [09:35:46] [<602b559b>] ? kunit_try_run_case+0x7b/0x100
>> [09:35:46] [<60086370>] ? to_kthread+0x0/0x40
>> [09:35:46] [<602b7850>] ? kunit_generic_run_threadfn_adapter+0x0/0x30
>> [09:35:46] [<602b7862>] ? kunit_generic_run_threadfn_adapter+0x12/0x30
>> [09:35:46] [<60086a51>] ? kthread+0xf1/0x250
>> [09:35:46] [<6004a541>] ? new_thread_handler+0x41/0x60
>> [09:35:46] [ERROR] Test: test_mb_mark_used: 0 tests run!
>> [09:35:46] ============= [NO TESTS RUN] test_mb_mark_used =============
>>