linux-kernel - Re: [BUG?] bcachefs performance: read is way too slow when a file has no overwrite.

Message-Id: <20240912075246.5810-1-00107082@163.com>
Date: Thu, 12 Sep 2024 15:52:46 +0800
From: David Wang <00107082@....com>
To: kent.overstreet@...ux.dev
Cc: linux-bcachefs@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [BUG?] bcachefs performance: read is way too slow when a file has no overwrite.

Hi, 

> I made some debug, when performance is bad, the conditions
> bvec_iter_sectors(iter) != pick.crc.uncompressed_size and 
> bvec_iter_sectors(iter) != pick.crc.live_size are "almost" always both "true",
> while when performance is good (after "thorough" write), they are only little
> percent (~350 out of 1000000)  to be true.
> 
> And if those conditions are "true", "bounce" would be set and code seems to run
> on a time consuming path.
> 
> I suspect "merely read" could never change those conditions, but "write" can?
> 

More updates:

1. Without a "thorough" write, no matter what the prepare write size is,
crc.compressed_size seems to always be 128 sectors = 64K?
2. After a "thorough" write with 4K block size, crc.compressed_size mostly
decreases to 4K; only a few are left at 8/12/16/20K...
3. If a 4K-thorough-write is followed by a 40K-thorough-write,
crc.compressed_size then increases to 40K, and 4K direct read suffers again....
4. If a 40K-thorough-write is followed by a 256K-thorough-write,
crc.compressed_size only increases to 64K; I guess 64K is the maximum
crc.compressed_size.

So I think the current conclusion is:
1. The initial crc.compressed_size is always 64K when the file is created/prepared.
2. Subsequent writes can change the crc size based on the write size. (optimized for write?)
3. Direct read performance is sensitive to this crc size; more test results:
	+-----------+--------+----------+
	| rand read |  IOPS  |    BW    |
	+-----------+--------+----------+
	|   4K !E   | 24.7K  | 101MB/s  |
	|   16K !E  | 24.7K  | 404MB/s  |
	|   64K !E  | 24.7K  | 1617MB/s |
	|    4K E   | ~220K  | ~900MB/s |
	|   16K E   |  ~55K  | ~900MB/s |
	|   64K E   | ~13.8K | ~900MB/s |
	+-----------+--------+----------+
E stands for the event that a "thorough" 4K write happened before the test.
Or, more specifically:
E: lots of random 4K writes, crc.compressed_size = 4K
!E: file was just created, crc.compressed_size = 64K
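
A rough back-of-the-envelope check of the table, as a sketch only. It
assumes (not confirmed in this thread) that a direct read touching a
checksummed extent has to fetch the whole crc.compressed_size from disk.
The helper name device_bandwidth_mb_s is made up for illustration.

```python
# Assumption (not confirmed): each read fetches at least one whole
# checksummed extent (crc.compressed_size), however small the request is.

KIB = 1024

def device_bandwidth_mb_s(iops, read_kib, crc_kib):
    """Estimated data pulled from the device, in MB/s (10^6 bytes/s)."""
    fetched_kib = max(read_kib, crc_kib)  # cannot fetch less than one extent
    return iops * fetched_kib * KIB / 1e6

# !E rows: crc.compressed_size = 64K, measured IOPS is 24.7K for every size
for read_kib in (4, 16, 64):
    print(f"!E {read_kib:>2}K:", round(device_bandwidth_mb_s(24_700, read_kib, 64)), "MB/s")

# E rows: crc.compressed_size = 4K, measured IOPS from the table
for read_kib, iops in ((4, 220_000), (16, 55_000), (64, 13_800)):
    print(f" E {read_kib:>2}K:", round(device_bandwidth_mb_s(iops, read_kib, 4)), "MB/s")
```

Under that assumption the three !E rows all come out at roughly 1.6GB/s of
device traffic, which would explain why IOPS stays flat at 24.7K while the
reported BW scales with the read size.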

The behavior seems reasonable from the write's point of view, but for read it
does not sound good.... If a read-only mmapped file pages in fewer than
16 pages at a time, the extra data read would waste lots of disk bandwidth.
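
To put a number on the mmap case, a small sketch under the same
unconfirmed assumption (whole 64K extents are fetched), plus the
assumptions of 4K pages and extent-aligned page-ins. wasted_fraction is a
made-up helper name.

```python
# Assumptions: 4K pages, page-ins aligned to extent boundaries, and every
# touched extent (crc.compressed_size) is fetched in full.

PAGE_KIB = 4

def wasted_fraction(pages, crc_kib=64):
    """Fraction of fetched data that the page-in did not ask for."""
    requested_kib = pages * PAGE_KIB
    # Round the request up to a whole number of extents.
    fetched_kib = -(-requested_kib // crc_kib) * crc_kib
    return 1 - requested_kib / fetched_kib

for pages in (1, 4, 8, 16):
    print(pages, "pages:", f"{wasted_fraction(pages):.1%}", "wasted")
```

With these assumptions a single-page fault would throw away 15/16 of what
was read, and the waste only reaches zero at 16 pages (64K).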

David