Message-ID: <1d0788af-bbac-44e6-8954-af7810fbb101@suse.com>
Date: Tue, 7 Jan 2025 08:30:05 +1030
From: Qu Wenruo <wqu@...e.com>
To: Daniel Vacek <neelx@...e.com>
Cc: Christian Brauner <brauner@...nel.org>, Qu Wenruo
<quwenruo.btrfs@....com>, linux-fsdevel@...r.kernel.org,
linux-btrfs <linux-btrfs@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: mnt_list corruption triggered during btrfs/326
On 2025/1/7 08:20, Daniel Vacek wrote:
> On Sat, 4 Jan 2025 at 23:26, Qu Wenruo <wqu@...e.com> wrote:
>>
>>
>>
>> On 2025/1/4 21:56, Christian Brauner wrote:
>>> On Wed, Jan 01, 2025 at 07:05:10AM +1030, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 2024/12/30 19:59, Qu Wenruo wrote:
>>>>> Hi,
>>>>>
>>>>> Although I know it's triggered from btrfs, the mnt_list handling is
>>>>> out of btrfs' control, so I'm here asking for some help.
>>>
>>> Thanks for the report.
>>>
>>>>>
>>>>> [BUG]
>>>>> With CONFIG_DEBUG_LIST and CONFIG_BUG_ON_DATA_CORRUPTION, and an
>>>>> upstream 6.13-rc kernel, which has commit 951a3f59d268 ("btrfs: fix
>>>>> mount failure due to remount races"), I can hit the following crash,
>>>>> with varied frequency (from 1 in 4 runs to hundreds of runs with no crash):
>>>>
>>>> There is also another WARNING triggered, without btrfs callback involved
>>>> at all:
>>>>
>>>> [ 192.688671] ------------[ cut here ]------------
>>>> [ 192.690016] WARNING: CPU: 3 PID: 59747 at fs/mount.h:150
>>>
>>> This would indicate that move_from_ns() was called on a mount that isn't
>>> attached to a mount namespace (anymore or never has).
>>>
>>> Here it's particularly peculiar because it looks like the warning is
>>> caused by calling move_from_ns() when moving a mount from an anonymous
>>> mount namespace in attach_recursive_mnt().
>>>
>>> Can you please try and reproduce this with
>>> commit 211364bef4301838b2e1 ("fs: kill MNT_ONRB")
>>> from the vfs-6.14.mount branch in
>>> https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git ?
>>>
>>
>> After the initial 1000 runs (with 951a3f59d268 ("btrfs: fix mount
>> failure due to remount races") cherry-picked, or it won't pass that test
>> case), there has been no crash nor warning so far.
>>
>> It's already the best run so far, but I'll keep it running for another
>> day or so just to be extra safe.
>>
>> So I guess the offending commit is 2eea9ce4310d ("mounts: keep list of
>> mounts in an rbtree")?
>
> This one was merged in v6.8 - why would it cause crashes only now?
Because in v6.8 btrfs also migrated to the new mount API, which caused
the ro/rw mount race that can fail the mount.
That's exactly why the test case was introduced.
Before the recent ro/rw mount fix, the test case wouldn't get that far
but would error out early, so we didn't have enough loops to trigger the bug.
>
>> Putting a list and an rb_tree into a union indeed seems a little
>> dangerous; sorry I didn't notice that earlier. But my vmcore indeed
>> shows a seemingly valid mnt_node (color = 1, both left/right are NULL).
>
> The union seems fine to me as long as the `MNT_ONRB` bit stays
> consistent. The crashes (and warnings) are simply caused by the flag
> missing where it should have been set.
That also means the mnt_flags need to be properly protected, at least at
the same level as mnt_list/mnt_node.
But a lot of the time such flags are accessed atomically using
test/set/clear_bit(), without the same level of lock protection.
So my current uneducated guess is that there is a race window where the
flag and the union member got out of sync.
Thus it's not as safe as a non-unioned member.
Thanks,
Qu
>
> --nX
>
>> Thanks a lot for the fix, and it's really a huge relief that it's not
>> something inside btrfs causing the bug.
>>
>> Thanks,
>> Qu
>>
>> [...]
>>>>>
>>>>> The only caller that doesn't hold @mount_lock is iterate_mounts(),
>>>>> but that's only called from audit, and I'm not sure audit is even
>>>>> involved in this case.
>>>
>>> This is fine as audit creates a private copy of the mount tree it is
>>> interested in. The mount tree is not visible to other callers anymore.
>>>
>>>>>
>>>>> So I ran out of ideas how this mnt_list corruption can even happen.
>>>>>
>>>>> Even if it's some abuse by btrfs, all mnt_list users are properly
>>>>> protected, so it should not lead to such list corruption.
>>>>>
>>>>> Any advice would be appreciated.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>
>>>
>>
>>