[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240912075246.5810-1-00107082@163.com>
Date: Thu, 12 Sep 2024 15:52:46 +0800
From: David Wang <00107082@....com>
To: kent.overstreet@...ux.dev
Cc: linux-bcachefs@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [BUG?] bcachefs performance: read is way too slow when a file has no overwrite.
Hi,
> I made some debug, when performance is bad, the conditions
> bvec_iter_sectors(iter) != pick.crc.uncompressed_size and
> bvec_iter_sectors(iter) != pick.crc.live_size are "almost" always both "true",
> while when performance is good (after "thorough" write), they are only little
> percent (~350 out of 1000000) to be true.
>
> And if those conditions are "true", "bounce" would be set and code seems to run
> on a time consuming path.
>
> I suspect "merely read" could never change those conditions, but "write" can?
>
More update:
1. Without a "thorough" write, it seems no matter what the prepare write size is,
crc.compressed_size is always 128 sectors = 64K?
2. With a "thorough" write with 4K block size, crc.compressed_size mostly descreases to 4K,
only a few crc.compressed_size left with 8/12/16/20K...
3. If a 4K-thorough-write followed by 40K-thorough-write, crc.compressed_size then
increases to 40K, and 4K direct read suffers again....
4. A 40K-through-write followed by 256K-thorough-write, crc.compressed_size only
increase to 64K, I guess 64K is maximum crc.compressed_size.
So I think current conclusion is:
1. The initial crc.compressed_size is always 64K when file was created/prepared.
2. Afterward writes can change crc size based on write size. (optimized for write?)
3. Direct read performance is sensitive to this crc size, more test result:
+-----------+--------+----------+
| rand read | IOPS | BW |
+-----------+--------+----------+
| 4K !E | 24.7K | 101MB/s |
| 16K !E | 24.7K | 404MB/s |
| 64K !E | 24.7K | 1617MB/s |
| 4K E | ~220K | ~900MB/s |
| 16K E | ~55K | ~900MB/s |
| 64K E | ~13.8K | ~900MB/s |
+-----------+--------+----------+
E stands for the event that a "thorough" 4k write happened before the test.
Or put it more specific:
E: lots of rand 4k-write, crc.compressed_size = 4K
!E: file was just created, crc.compressed_size = 64K
The behavior seems reasonable from write's point of view, but for read it
dose not sounds good....If a mmaped readonly file, page in less than
16 pages, those extra data would waste lots of disk bandwidth.
David
Powered by blists - more mailing lists