Message-ID: <ed53dc33-c811-4c20-8713-8d2d32cb81d7@oracle.com>
Date: Wed, 9 Apr 2025 09:15:23 +0100
From: John Garry <john.g.garry@...cle.com>
To: Dave Chinner <david@...morbit.com>, "Darrick J. Wong" <djwong@...nel.org>
Cc: brauner@...nel.org, hch@....de, viro@...iv.linux.org.uk, jack@...e.cz,
        cem@...nel.org, linux-fsdevel@...r.kernel.org, dchinner@...hat.com,
        linux-xfs@...r.kernel.org, linux-kernel@...r.kernel.org,
        ojaswin@...ux.ibm.com, ritesh.list@...il.com,
        martin.petersen@...cle.com, linux-ext4@...r.kernel.org,
        linux-block@...r.kernel.org, catherine.hoang@...cle.com
Subject: Re: [PATCH v6 11/12] xfs: add xfs_compute_atomic_write_unit_max()

On 09/04/2025 06:30, Dave Chinner wrote:
>> This is why I don't agree with adding a static 16MB limit -- we clearly
>> don't need it to emulate current hardware, which can commit up to 64k
>> atomically.  Future hardware can increase that by 64x and we'll still be
>> ok with using the existing tr_write transaction type.
>>
>> By contrast, adding a 16MB limit would result in a much larger minimum
>> log size.  If we add that to struct xfs_trans_resv for all filesystems
>> then we run the risk of some ancient filesystem with a 12M log
>> suddenly failing to mount on a new kernel.
>>
>> I don't see the point.
> You've got stuck on the example size of 16MB I gave, not
> the actual reason I gave that example.

You did provide a relatively large value in 16MB. When I say relative, I 
mean relative to what can be achieved with HW offload today.

The target user we see for this feature is DBs, and they want to do 
writes in the 16/32/64KB size range. Indeed, these are the sort of sizes 
we see supported in terms of disk atomic write support today.

Furthermore, they (DBs) want fast and predictable performance which HW 
offload provides. They do not want to use a slow software-based 
solution. Such a software-based solution will always be slower, as we 
need to deal with block alloc/de-alloc and extent remapping for every write.

So are there people who really want very large atomic write support and
will tolerate slow performance, i.e. slower than what can be achieved
with a double-write buffer or some other application-level logging?

Thanks,
John