[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <24c7304c-9a38-278b-0ae5-78edb225cb4c@oracle.com>
Date: Sun, 25 Jul 2021 21:49:17 +0800
From: Anand Jain <anand.jain@...cle.com>
To: dsterba@...e.cz, dsterba@...e.com
Cc: Desmond Cheong Zhi Xi <desmondcheongzx@...il.com>, clm@...com,
josef@...icpanda.com, linux-btrfs@...r.kernel.org,
linux-kernel@...r.kernel.org, skhan@...uxfoundation.org,
gregkh@...uxfoundation.org,
linux-kernel-mentees@...ts.linuxfoundation.org,
syzbot+a70e2ad0879f160b9217@...kaller.appspotmail.com
Subject: Re: [PATCH] btrfs: fix rw device counting in
__btrfs_free_extra_devids
On 22/07/2021 01:59, David Sterba wrote:
> On Thu, Jul 15, 2021 at 06:34:03PM +0800, Desmond Cheong Zhi Xi wrote:
>> Syzbot reports a warning in close_fs_devices that happens because
>> fs_devices->rw_devices is not 0 after calling btrfs_close_one_device
>> on each device.
>>
>> This happens when a writeable device is removed in
>> __btrfs_free_extra_devids, but the rw device count is not decremented
>> accordingly. So when close_fs_devices is called, the removed device is
>> still counted and we get an off by 1 error.
>>
>> Here is one call trace that was observed:
>> btrfs_mount_root():
>> btrfs_scan_one_device():
>> device_list_add(); <---------------- device added
>> btrfs_open_devices():
>> open_fs_devices():
>> btrfs_open_one_device(); <-------- rw device count ++
>> btrfs_fill_super():
>> open_ctree():
>> btrfs_free_extra_devids():
>> __btrfs_free_extra_devids(); <--- device removed
>> fail_tree_roots:
>> btrfs_close_devices():
>> close_fs_devices(); <------- rw device count off by 1
>>
>> Fixes: cf89af146b7e ("btrfs: dev-replace: fail mount if we don't have replace item with target device")
>
> What this patch did in the last hunk was the rw_devices decrement, but
> conditional:
>
> @@ -1080,9 +1071,6 @@ static void __btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices,
> if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
> list_del_init(&device->dev_alloc_list);
> clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
> - if (!test_bit(BTRFS_DEV_STATE_REPLACE_TGT,
> - &device->dev_state))
This condition was wrong.
The 1st roll of this patch which is here [1], has the details of why. As
shown below -
[1]
https://patchwork.kernel.org/project/linux-btrfs/patch/b3a0a629df98bd044a1fd5c4964f381ff6e7aa05.1600777827.git.anand.jain@oracle.com/#23640775
----
rw_devices is incremented in btrfs_open_one_device() for all write-able
devices except for devid == BTRFS_DEV_REPLACE_DEVID.
But while we clean up the extra devices in __btrfs_free_extra_devids()
we used the BTRFS_DEV_STATE_REPLACE_TGT flag isn't set because
there isn't the replace-item. So rw_devices went below zero.
----
> - fs_devices->rw_devices--;
> }
> list_del_init(&device->dev_list);
> fs_devices->num_devices--;
> ---
>
>
>> @@ -1078,6 +1078,7 @@ static void __btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices,
>> if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
>> list_del_init(&device->dev_alloc_list);
>> clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
>> + fs_devices->rw_devices--;
>> }
>> list_del_init(&device->dev_list);
>> fs_devices->num_devices--;
>
> So should it be reinstated in the original form?
No. The reason is the same as above.
Only the rw_devices decrement line has to be restored.
> The rest of
> cf89af146b7e handles unexpected device replace item during mount.
> Adding the decrement is correct, but right now I'm not sure about the
> corner case when teh devcie has the BTRFS_DEV_STATE_REPLACE_TGT bit set.
BTRFS_DEV_STATE_REPLACE_TGT is set (on BTRFS_DEV_REPLACE_DEVID) for two
reasons when we call replace through ioctl or during mount upon finding
a replace-device item.
> The state machine of the device bits and counters is not trivial so
> fixing it one way or the other could lead to further syzbot reports if
> we don't understand the issue.
I agree. Also, a good idea to convert this sysbot test into an xfstests
case.
Thanks, Anand
Powered by blists - more mailing lists