Message-ID: <a08a9491-61d7-b300-55ba-b016dd5aad5a@huaweicloud.com>
Date: Wed, 14 Aug 2024 11:57:03 +0800
From: Zhang Yi <yi.zhang@...weicloud.com>
To: Dave Chinner <david@...morbit.com>
Cc: linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, djwong@...nel.org, hch@...radead.org,
brauner@...nel.org, jack@...e.cz, willy@...radead.org, yi.zhang@...wei.com,
chengzhihao1@...wei.com, yukuai3@...wei.com
Subject: Re: [PATCH v2 0/6] iomap: some minor non-critical fixes and
improvements when block size < folio size
On 2024/8/14 10:47, Dave Chinner wrote:
> On Wed, Aug 14, 2024 at 10:14:01AM +0800, Zhang Yi wrote:
>> On 2024/8/14 9:49, Dave Chinner wrote:
>>> important to know if the changes made actually provided the benefit
>>> we expected them to make....
>>>
>>> i.e. this is the sort of table of results I'd like to see provided:
>>>
>>> platform    base        v1          v2
>>> x86         524708.0    569218.0    ????
>>> arm64       801965.0    871605.0    ????
>>>
>>
>> platform    base        v1          v2
>> x86         524708.0    571315.0    569218.0
>> arm64       801965.0    876077.0    871605.0
>
> So avoiding the lock cycle in iomap_write_begin() (in patch 5) in
> this partial block write workload made no difference to performance
> at all, and removing a lock cycle in iomap_write_end provided all
> that gain?
Yes.
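
For reference, the relative gains implied by the table above can be
worked out as below. This is only arithmetic on the posted numbers; the
dictionary layout and the helper name gain_pct are illustrative and not
part of the benchmark or the patch set.

```python
# Scores copied from the table above (base, v1, v2 per platform).
results = {
    "x86":   {"base": 524708.0, "v1": 571315.0, "v2": 569218.0},
    "arm64": {"base": 801965.0, "v1": 876077.0, "v2": 871605.0},
}


def gain_pct(base, value):
    """Relative improvement of `value` over `base`, in percent."""
    return (value - base) / base * 100.0


# Gain of each patched version over base, per platform.
gains = {
    platform: {ver: gain_pct(row["base"], row[ver]) for ver in ("v1", "v2")}
    for platform, row in results.items()
}

for platform, g in gains.items():
    print(f"{platform}: v1 +{g['v1']:.2f}%  v2 +{g['v2']:.2f}%")
```

Both versions land roughly 8.5 to 9.2 percent above base on each
platform, and v1 and v2 differ by well under one percentage point.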
>
> Is this an overwrite workload or a file extending workload? The
> result implies that iomap_block_needs_zeroing() is returning false,
> hence it's an overwrite workload and it's reading partial blocks
> from disk. i.e. it is doing synchronous RMW cycles from the ramdisk
> and so still calling the uptodate bitmap update function rather than
> hitting the zeroing case and skipping it.
>
> Hence I'm just trying to understand what the test is doing because
> that tells me what the result should be...
>
I forgot to mention that I tested this on xfs with a 1K block size. It
is a simple case of block size < folio size that I can run directly
with UnixBench.

The test first does buffered append writes with bs=1K,count=2000 in the
first round, and then repeatedly overwrites from the start position with
the same parameters for 30 seconds. All the write operations are block
size aligned, so iomap_write_begin() just continues after
iomap_adjust_read_range() and never called iomap_set_range_uptodate()
to mark the range uptodate in the first place. Hence patch 5 makes no
difference in this test case.
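
The reasoning above can be sketched with a small user-space model. It is
an assumption-laden simplification, not the kernel code: the function
name write_begin_skips_block is hypothetical, the skip condition is
modeled on the "write does not partially cover the block range" check
that follows iomap_adjust_read_range(), the per-block uptodate bitmap
and the unshare case are ignored, and a 4K folio is assumed.

```python
def write_begin_skips_block(pos, length, block_size=1024, folio_size=4096):
    """Model: True if a write of `length` bytes at file offset `pos` needs
    neither a read nor zeroing, i.e. the loop would just `continue`.

    Simplified sketch; assumes the write fits inside a single folio.
    """
    if length <= 0:
        raise ValueError("length must be positive")

    # Byte range of the write inside its folio.
    start = pos % folio_size
    end = start + length
    if end > folio_size:
        raise ValueError("write must not cross a folio boundary")

    # Block-aligned range covering the write (round down / round up).
    poff = (start // block_size) * block_size
    plen = -(-end // block_size) * block_size - poff

    # Skip when neither end of the write lands strictly inside the range,
    # so no partial block has to be read from disk or zeroed.
    return ((start <= poff or start >= poff + plen) and
            (end <= poff or end >= poff + plen))


# The test workload: 2000 block-aligned 1K writes starting at offset 0.
all_skipped = all(write_begin_skips_block(i * 1024, 1024) for i in range(2000))
print(all_skipped)
```

In this model every write of the bs=1K,count=2000 workload takes the
skip path, while an unaligned write such as 512 bytes at offset 512
does not, which matches the observation that patch 5 has nothing to
optimize here.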
Thanks,
Yi.