Message-ID: <f6ae39fd-ee30-4e22-8d0d-6dec5c3bd192@gmx.com>
Date: Tue, 24 Sep 2024 07:53:50 +0930
From: Qu Wenruo <quwenruo.btrfs@....com>
To: Johannes Thumshirn <Johannes.Thumshirn@....com>,
 Johannes Thumshirn <jth@...nel.org>, Chris Mason <clm@...com>,
 Josef Bacik <josef@...icpanda.com>, David Sterba <dsterba@...e.com>,
 "open list:BTRFS FILE SYSTEM" <linux-btrfs@...r.kernel.org>,
 open list <linux-kernel@...r.kernel.org>
Cc: WenRuo Qu <wqu@...e.com>, Naohiro Aota <Naohiro.Aota@....com>
Subject: Re: [PATCH] btrfs: also add stripe entries for NOCOW writes



On 2024/9/24 00:11, Johannes Thumshirn wrote:
> On 23.09.24 10:54, Qu Wenruo wrote:
>>
>>
[...]
>> Finally, I do not think it's a good idea to insert RST entries for NOCOW.
>> If a file is set NOCOW, it means we'll be doing a lot of overwrites for it.
>> Then why waste our time updating the RST entries again and again?
>>
>> Isn't such behavior going to cause more write amplification? Meanwhile
>> for non-RST cases, NOCOW should cause the least amount of write
>> amplification.
>
> The whole idea behind the RST was to write the RST entries _after_ the
> data has been persisted to disk. Otherwise we're back at the write hole
> problem. See for example this imaginary sequence:
>
> Preallocate a range. This will then also preallocate the RST entries
> with the mapping as you describe. Write to it, and while you write you
> have a power loss. The copy/stripe to disk 1 is correctly written, but
> disk 2 didn't report back before the power loss happened. After we have
> power again, a read to disk 2 comes in. As we have an RST entry, the
> read will be directed to the broken entry and garbage is returned. And
> this is the good case, as we can repair it.
> If it was an overwrite of a block and the same happens, we have an RST
> entry pointing to a good and a bad copy.

Nope, that will not happen.

Because our metadata is still COW protected, after such a power loss the
file extent still shows the range as PREALLOCATED, so we won't even
trigger a read.

And this is exactly the same as the non-RST PREALLOCATED write.
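
To make the argument concrete, here is a toy Python model of that
sequence. It is only a sketch and not btrfs code: the class ToyFs, its
fields, and the two-mirror layout are all made up for illustration. It
assumes the PREALLOCATED -> REGULAR flag change becomes durable only at a
metadata commit, and that a read of a PREALLOCATED range returns zeros
without touching the devices.

```python
from dataclasses import dataclass, field

PREALLOC = "PREALLOC"
REGULAR = "REGULAR"
ZEROS = b"\0" * 4  # what a read of a preallocated range returns in this model


@dataclass
class ToyFs:
    # Flag as stored in committed (COW-protected) metadata.
    committed_flag: str = PREALLOC
    # Flag as seen in memory; lost on power failure until committed.
    pending_flag: str = PREALLOC
    # Contents of the two copies; None stands for stale or torn data.
    mirrors: list = field(default_factory=lambda: [None, None])

    def write(self, data, lost_mirror=None):
        """Write to each mirror in turn; stop at lost_mirror (power lost)."""
        for i in range(len(self.mirrors)):
            if i == lost_mirror:
                return False  # this copy never completed
            self.mirrors[i] = data
        # Only after every copy completed is the extent marked REGULAR.
        self.pending_flag = REGULAR
        return True

    def commit(self):
        """Make the in-memory flag durable."""
        self.committed_flag = self.pending_flag

    def power_loss(self):
        """Drop everything that was not committed."""
        self.pending_flag = self.committed_flag

    def read(self, mirror):
        """Consult committed metadata first; PREALLOC never reaches a device."""
        if self.committed_flag == PREALLOC:
            return ZEROS
        return self.mirrors[mirror]


# The sequence from the quoted mail: disk 1 written, disk 2 not, then crash.
fs = ToyFs()
completed = fs.write(b"data", lost_mirror=1)
fs.power_loss()
after_crash = fs.read(1)  # zeros, the bad copy is never read
```

In this model the torn copy on the second mirror is unreachable after the
crash, because the committed file extent still says PREALLOCATED. Whether
an RST entry exists for the range does not change the outcome of the read.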

>
> Once we're adding the RST entries after both writes succeed, the problem
> isn't there. So for preallocated extents it is even harmful to add an RST
> entry.

You just forgot the metadata part, which prevents the problem from
happening in the first place.

Thanks,
Qu

