Message-ID: <333e6a95-b827-4938-9477-ad5ff5398cbe@oracle.com>
Date: Fri, 29 Nov 2024 11:36:13 +0000
From: John Garry <john.g.garry@...cle.com>
To: Christoph Hellwig <hch@....de>
Cc: Dave Chinner <david@...morbit.com>,
        Ritesh Harjani
 <ritesh.list@...il.com>, chandan.babu@...cle.com,
        djwong@...nel.org, dchinner@...hat.com, viro@...iv.linux.org.uk,
        brauner@...nel.org, jack@...e.cz, linux-xfs@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        catherine.hoang@...cle.com, martin.petersen@...cle.com
Subject: Re: [PATCH v4 00/14] forcealign for xfs

On 24/09/2024 10:48, John Garry wrote:
>>
>>>
>>>> but more importantly not introducing
>>>> additional complexities by requiring to be able to write over the
>>>> written/unwritten boundaries created by either rtextentsize > 1 or
>>>> the forcealign stuff if you actually want atomic writes.
>>>
>>> The very original solution required a single mapping and in written 
>>> state
>>> for atomic writes. Reverting to that would save a lot of hassle in the
>>> kernel. It just means that the user needs to manually pre-zero.
>>
>> What atomic I/O sizes do your users require?  Would they fit into
>> a large sector size now supported by XFS (i.e. 32k for now).
>>
> 
> It could be used, but then we have 16KB filesystem block size, which 
> some just may not want. And we just don't want 16KB sector size, but I 
> don't think that we require that if we use RWF_ATOMIC.

Hi Christoph,

I want to come back to this topic of forcealign.

We have been doing a lot of MySQL workload testing with the following 
configurations:
a. 4K FS blocksize and RT 16K rtextsize
b. 4K FS blocksize and forcealign 16K extsize
c. 16K FS blocksize
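For reference, the three configurations could be created roughly as below. 
Device names are illustrative, and since forcealign is not yet upstream, 
its mkfs option naming here is an assumption based on this series:

```shell
# a. 4K FS blocksize, realtime device with 16K rtextsize
mkfs.xfs -b size=4k -r rtdev=/dev/sdb,extsize=16k /dev/sda
# (files must then carry the realtime flag, e.g. inherited via
#  xfs_io -c 'chattr +t' on the parent directory)

# b. 4K FS blocksize, forcealign with 16K extsize
#    (option naming is an assumption - forcealign is not upstream)
mkfs.xfs -b size=4k -d extsize=16k,forcealign /dev/sda

# c. 16K FS blocksize (needs large block size support in XFS)
mkfs.xfs -b size=16k /dev/sda
```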

a. and b. show comparable performance, with b. slightly better overall. 
Generally c. shows significantly slower performance at lower thread 
counts/lower loads. We put that down to MySQL REDO log write 
amplification from the larger FS blocksize. At higher loads, performance 
is comparable.

So we tried:
d. 4K FS blocksize for the REDO log on one partition and 16K FS 
blocksize for DB pagesize atomic writes on another partition

For d., performance was good and comparable to a. and b., if not a bit 
better overall.

Unfortunately d. does not allow us to take a single FS snapshot, so it 
is not of much value for production.
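For context, the DB pagesize writes in all of these configurations go 
through pwritev2() with RWF_ATOMIC (Linux 6.11+). A rough userspace 
sketch of one untorn 16K page write - the file name, helper name, and 
16K size are illustrative:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/uio.h>
#include <unistd.h>

/* RWF_ATOMIC appeared in the Linux 6.11 uapi headers */
#ifndef RWF_ATOMIC
#define RWF_ATOMIC 0x00000040
#endif

/* 16K DB page, matching the 16K extsize/rtextsize above */
#define DB_PAGESZ 16384

/*
 * Issue one untorn 16K page write: the kernel guarantees that either
 * the whole page reaches stable storage or none of it does. In
 * practice this needs O_DIRECT, an aligned buffer, and a filesystem/
 * device advertising atomic write support; otherwise the call fails
 * cleanly with EINVAL or EOPNOTSUPP.
 */
static ssize_t write_db_page(int fd, const void *page, off_t off)
{
	struct iovec iov = {
		.iov_base = (void *)page,
		.iov_len  = DB_PAGESZ,
	};

	return pwritev2(fd, &iov, 1, off, RWF_ATOMIC);
}
```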

I was talking to Martin about this log write topic, and he considers 
that there are many other scenarios where a larger FS blocksize can hurt 
log write latency, so a large FS blocksize is quite undesirable.

So we would still like to try for forcealign.

However, enabling large atomic writes for rtvol is quite simple, and 
there would be overlap with enabling large atomic writes for forcealign 
- see 
https://github.com/johnpgarry/linux/commits/atomic-write-large-atomics-pre-v6.13/ 
- so I am thinking of trying that first.

Let me know what you think.

Thanks,
John
