Message-ID: <79f17c7a.65f.19217621c47.Coremail.00107082@163.com>
Date: Sun, 22 Sep 2024 09:39:18 +0800 (CST)
From: "David Wang" <00107082@....com>
To: "Kent Overstreet" <kent.overstreet@...ux.dev>
Cc: linux-bcachefs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [BUG?] bcachefs performance: read is way too slow when a file
has no overwrite.
Hi,
At 2024-09-22 00:12:01, "Kent Overstreet" <kent.overstreet@...ux.dev> wrote:
>On Sun, Sep 22, 2024 at 12:02:07AM GMT, David Wang wrote:
>> Hi,
>>
>> At 2024-09-09 21:37:35, "Kent Overstreet" <kent.overstreet@...ux.dev> wrote:
>> >On Sat, Sep 07, 2024 at 06:34:37PM GMT, David Wang wrote:
>>
>> >
>> >Big standard deviation (high tail latency?) is something we'd want to
>> >track down. There's a bunch of time_stats in sysfs, but they're mostly
>> >for the write paths. If you're trying to identify where the latencies
>> >are coming from, we can look at adding some new time stats to isolate.
>>
>> About performance, I have a theory based on some observation I made recently:
>> When user space app make a 4k(8 sectors) direct write,
>> bcachefs would initiate a write request of ~11 sectors, including the checksum data, right?
>> This may not be a good offset+size pattern of block layer for performance.
>> (I did get a very-very bad performance on ext4 if write with 5K size.)
>
>The checksum isn't inline with the data, it's stored with the pointer -
>so if you're seeing 11 sector writes, something really odd is going
>on...
>
... This really contradicts my observations:
1. fio reports an average of 50K IOPS for a 400-second random direct write test.
2. From /proc/diskstats, the average of "Field 5 -- # of writes completed" per second is also 50K.
(From this I conclude the performance issue is not caused by extra IOPS for checksums.)
3. From "Field 10 -- # of milliseconds spent doing I/Os", the average disk "busy" time per second is about 0.9 seconds, similar to the result of the ext4 test.
(From this I conclude the performance issue is not caused by failing to push the disk hard enough.)
4. delta(Field 7 -- # of sectors written) / delta(Field 5 -- # of writes completed) over a 5-minute interval is 11 sectors/write.
(This is why I came up with the theory that the checksum is stored with the raw data... I thought it was a reasonable one.)
I will write some debug code to collect the sector offset and size patterns.
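
For reference, the per-second numbers in points 2-4 can be derived from two snapshots of /proc/diskstats. The following is only a rough sketch: the function names are hypothetical, and it assumes the layout described in the kernel's iostats documentation, where three identifying columns (major, minor, device name) come before "Field 1".

```python
def parse_diskstats(text, device):
    """Return (writes_completed, sectors_written, io_ms) for `device`.

    `text` is the full content of /proc/diskstats. Columns 0..2 are
    major, minor and device name, so "Field N" sits at column 2 + N.
    """
    for line in text.splitlines():
        tok = line.split()
        if len(tok) >= 13 and tok[2] == device:
            # Field 5, Field 7 and Field 10 respectively.
            return int(tok[7]), int(tok[9]), int(tok[12])
    raise KeyError(device)


def summarize(before, after, device, interval_s):
    """Compare two snapshots taken `interval_s` seconds apart."""
    w0, s0, t0 = parse_diskstats(before, device)
    w1, s1, t1 = parse_diskstats(after, device)
    writes = w1 - w0
    return {
        # delta(Field 5) per second
        "write_iops": writes / interval_s,
        # delta(Field 7) / delta(Field 5)
        "sectors_per_write": (s1 - s0) / writes if writes else 0.0,
        # delta(Field 10) in ms, as a fraction of the wall-clock interval
        "busy_fraction": (t1 - t0) / (interval_s * 1000.0),
    }
```

To use it, read /proc/diskstats once, sleep for the interval, read it again, and pass both strings to summarize() together with the device name. With 512-byte sectors, a result of 11 sectors/write corresponds to 5632 bytes per request, compared to 4096 bytes for the 8 sectors submitted by the application.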
>I would suggest doing some testing with data checksums off first, to
>isolate the issue; then it sounds like that IO pattern needs to be
>looked at.
I will try it.
>
>Check the extents btree in debugfs as well, to make sure the extents are
>getting written out as you think they are.
Thanks
David