Message-ID: <feead66e-5b83-7e54-1164-c7c61e78e7be@huaweicloud.com>
Date: Wed, 14 Aug 2024 10:14:01 +0800
From: Zhang Yi <yi.zhang@...weicloud.com>
To: Dave Chinner <david@...morbit.com>
Cc: linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, djwong@...nel.org, hch@...radead.org,
brauner@...nel.org, jack@...e.cz, willy@...radead.org, yi.zhang@...wei.com,
chengzhihao1@...wei.com, yukuai3@...wei.com
Subject: Re: [PATCH v2 0/6] iomap: some minor non-critical fixes and
improvements when block size < folio size
On 2024/8/14 9:49, Dave Chinner wrote:
> On Mon, Aug 12, 2024 at 08:11:53PM +0800, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@...wei.com>
>>
>> Changes since v1:
>> - Patch 5 fixes a stale data exposure problem pointed out by Willy by
>> dropping the setting of uptodate bits after zeroing out an unaligned
>> range.
>> - As Dave suggested, in order to avoid increasing the complexity of
>> maintaining the state_lock, patch 6 no longer drops all the state_lock
>> usage in the buffered write path; instead, it introduces a new helper
>> to set the uptodate and dirty bits together under the state_lock,
>> saving one round of locking per write. The performance benefits do not
>> change much.
>
> It's helpful to provide a lore link to the previous version so that
> reviewers don't have to go looking for it themselves to remind them
> of what was discussed last time.
>
> https://lore.kernel.org/linux-xfs/20240731091305.2896873-1-yi.zhang@huaweicloud.com/T/
Sure, will add in my later iterations.
>
>> This series contains some minor non-critical fixes and performance
>> improvements for filesystems with block size < folio size.
>>
>> The first 4 patches fix the handling of setting and clearing folio ifs
>> dirty bits when marking the folio dirty and when invalidating the
>> folio. Although none of these mistakes causes a real problem now, they
>> still deserve a fix to correct the behavior.
>>
>> The last 2 patches drop an unnecessary state_lock cycle in ifs when
>> setting and clearing dirty/uptodate bits in the buffered write path,
>> which improves buffered write performance by ~8% on my machine. I
>> tested it through UnixBench on my x86_64 (Xeon Gold 6151) and arm64
>> (Kunpeng-920) virtual machines with a 50GB ramdisk and an xfs
>> filesystem; the results are shown below.
>>
>> UnixBench test cmd:
>> ./Run -i 1 -c 1 fstime-w
>>
>> Before:
>> x86 File Write 1024 bufsize 2000 maxblocks 524708.0 KBps
>> arm64 File Write 1024 bufsize 2000 maxblocks 801965.0 KBps
>>
>> After:
>> x86 File Write 1024 bufsize 2000 maxblocks 569218.0 KBps
>> arm64 File Write 1024 bufsize 2000 maxblocks 871605.0 KBps
>
> Those are the same performance numbers as you posted for the
> previous version of the patch. How does this new version perform
> given that it's a complete rework of the optimisation? It's
It's not exactly the same, but the difference is small; I've updated
the performance numbers in this cover letter.
> important to know if the changes made actually provided the benefit
> we expected them to make....
>
> i.e. this is the sort of table of results I'd like to see provided:
>
> platform base v1 v2
> x86 524708.0 569218.0 ????
> arm64 801965.0 871605.0 ????
>
platform    base        v1          v2
x86         524708.0    571315.0    569218.0
arm64       801965.0    876077.0    871605.0
Thanks,
Yi.