Message-ID: <8b9e31f4-0ec6-4817-8214-4dfc4e988265@wdc.com>
Date: Tue, 11 Nov 2025 08:31:30 +0000
From: Hans Holmberg <Hans.Holmberg@....com>
To: hch <hch@....de>, Florian Weimer <fweimer@...hat.com>
CC: "linux-xfs@...r.kernel.org" <linux-xfs@...r.kernel.org>, Carlos Maiolino
	<cem@...nel.org>, Dave Chinner <david@...morbit.com>, "Darrick J . Wong"
	<djwong@...nel.org>, "linux-fsdevel@...r.kernel.org"
	<linux-fsdevel@...r.kernel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "libc-alpha@...rceware.org"
	<libc-alpha@...rceware.org>, Matthew Wilcox <willy@...radead.org>
Subject: Re: [RFC] xfs: fake fallocate success for always CoW inodes

On 06/11/2025 15:46, Christoph Hellwig wrote:
> On Thu, Nov 06, 2025 at 02:42:30PM +0000, Matthew Wilcox wrote:
>> On Thu, Nov 06, 2025 at 02:52:12PM +0100, Christoph Hellwig wrote:
>>> On Thu, Nov 06, 2025 at 02:48:12PM +0100, Florian Weimer wrote:
>>>> * Hans Holmberg:
>>>>
>>>>> We don't support preallocations for CoW inodes and we currently fail
>>>>> with -EOPNOTSUPP, but this causes an issue for users of glibc's
>>>>> posix_fallocate[1]. If fallocate fails, posix_fallocate falls back on
>>>>> writing actual data into the range to try to allocate blocks that way.
>>>>> That does not actually guarantee anything for CoW inodes however as we
>>>>> write out of place.
>>>> Why doesn't fallocate trigger the copy instead?  Isn't this what the
>>>> user is requesting?
>>> What copy?
>> I believe Florian is thinking of CoW in the sense of "share while read
>> only, then you have a mutable block allocation", rather than the
>> WAFL (or SMR) sense of "we always put writes in a new location".
> Note that the glibc posix_fallocate(3) fallback will never copy anyway.
> It does a racy and somewhat broken check if there is already
> data, and if it thinks there isn't it writes zeroes.  Which is the
> wrong thing for just about every use case imaginable.  And the only
> thing to stop it from doing that is to implement fallocate(2) and
> return success.

Instead of returning success in fallocate(2), could we return a
distinct error code that would tell the caller that:

Optimized allocation is not supported, AND there is no use trying to
preallocate data using writes?

EUSELESS would be nice to have, but that is not available.

Then posix_fallocate could either fail with -EINVAL (which looks legit
according to the man page: "the underlying filesystem does not support the
operation") or skip the writes and return success (whichever is preferable).
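
To make the idea concrete, here is a rough sketch of how a caller such as
posix_fallocate could act on the error from fallocate(2). It is not glibc
code. The enum, the function name and the useless_errno parameter are all
made up for illustration, and since no EUSELESS exists, the "distinct error
code" is passed in as a parameter rather than named. The EOPNOTSUPP branch
reflects my understanding of today's fallback behaviour.

```c
#include <errno.h>

/* Hypothetical policy outcomes; none of these names exist in glibc. */
enum prealloc_action {
	PREALLOC_DONE,           /* fallocate(2) succeeded */
	PREALLOC_SKIP_WRITES,    /* unsupported, and writing data would not help */
	PREALLOC_WRITE_FALLBACK, /* unsupported, emulate by writing into the range */
	PREALLOC_FAIL            /* any other error, report it to the caller */
};

/*
 * err is 0 or the errno value left behind by fallocate(2).
 * useless_errno stands in for the not-yet-existing distinct error code;
 * which value the kernel would actually use is an open question.
 */
static enum prealloc_action classify_prealloc_error(int err, int useless_errno)
{
	if (err == 0)
		return PREALLOC_DONE;
	if (err == useless_errno)
		return PREALLOC_SKIP_WRITES;
	if (err == EOPNOTSUPP)
		return PREALLOC_WRITE_FALLBACK;
	return PREALLOC_FAIL;
}
```

On PREALLOC_SKIP_WRITES the caller would then pick one of the two options
above: return EINVAL, or return success without touching the file.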
