lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e9c5bb00-b609-aff9-fc95-ca1c5b9c2899@gmail.com>
Date:   Fri, 20 Aug 2021 11:09:05 +0800
From:   Desmond Cheong Zhi Xi <desmondcheongzx@...il.com>
To:     dsterba@...e.cz, clm@...com, josef@...icpanda.com,
        dsterba@...e.com, anand.jain@...cle.com,
        linux-btrfs@...r.kernel.org, linux-kernel@...r.kernel.org,
        skhan@...uxfoundation.org, gregkh@...uxfoundation.org,
        linux-kernel-mentees@...ts.linuxfoundation.org,
        syzbot+a70e2ad0879f160b9217@...kaller.appspotmail.com
Subject: Re: [PATCH v2] btrfs: fix rw device counting in
 __btrfs_free_extra_devids

On 20/8/21 1:34 am, David Sterba wrote:
> On Fri, Aug 20, 2021 at 01:11:58AM +0800, Desmond Cheong Zhi Xi wrote:
>>>>> The option #2 does not sound safe because the TGT bit is checked in
>>>>> several places where device list is queried for various reasons, even
>>>>> without a mounted filesystem.
>>>>>
>>>>> Removing the assertion makes more sense but I'm still not convinced that
>>>>> the this is expected/allowed state of a closed device.
>>>>>
>>>>
>>>> Would it be better if we cleared the REPLACE_TGT bit only when closing
>>>> the device where device->devid == BTRFS_DEV_REPLACE_DEVID?
>>>>
>>>> The first conditional in btrfs_close_one_device assumes that we can come
>>>> across such a device. If we come across it, we should properly reset it.
>>>>
>>>> If other devices has this bit set, the ASSERT will still catch it and
>>>> let us know something is wrong.
>>>
>>> That sounds great.
>>>
>>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>>> index 70f94b75f25a..a5afebb78ecf 100644
>>>> --- a/fs/btrfs/volumes.c
>>>> +++ b/fs/btrfs/volumes.c
>>>> @@ -1130,6 +1130,9 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>>>>                    fs_devices->rw_devices--;
>>>>            }
>>>>     
>>>> +       if (device->devid == BTRFS_DEV_REPLACE_DEVID)
>>>> +               clear_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state);
>>>> +
>>>>            if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
>>>>                    fs_devices->missing_devices--;
>>>
>>> I'll do a few test rounds, thanks.
>>
>> Just following up. Did that resolve the issue or is further
>> investigation needed?
> 
> The fix seems to work, I haven't seen the assertion fail anymore,
> incidentally the crash also stopped to show up on an unpatched branch.
> 

Sounds good, thanks for the update. If there's anything else I can help 
with, please let me know.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ