Message-ID: <392abd73-c58a-0a34-bd21-1e9adfffc870@suse.de>
Date: Thu, 14 Jan 2021 18:05:41 +0800
From: Coly Li <colyli@...e.de>
To: Dongdong Tao <dongdong.tao@...onical.com>
Cc: Kent Overstreet <kent.overstreet@...il.com>,
"open list:BCACHE (BLOCK LAYER CACHE)" <linux-bcache@...r.kernel.org>,
open list <linux-kernel@...r.kernel.org>,
Gavin Guo <gavin.guo@...onical.com>,
Gerald Yang <gerald.yang@...onical.com>,
Trent Lloyd <trent.lloyd@...onical.com>,
Dominique Poulain <dominique.poulain@...onical.com>,
Dongsheng Yang <dongsheng.yang@...ystack.cn>
Subject: Re: [PATCH] bcache: consider the fragmentation when update the
writeback rate
On 1/14/21 12:45 PM, Dongdong Tao wrote:
> Hi Coly,
>
> I've got the testing data for multiple threads with larger IO depth.
>
Hi Dongdong,
Thanks for the testing numbers.
> Here are the testing steps:
> 1. make-bcache -B <> -C <> --writeback
>
> 2. Open two tabs and start a different fio task in each at the same time.
> Tab1 runs the fio command below:
> sudo fio --name=random-writers --filename=/dev/bcache0 --ioengine=libaio
> --iodepth=32 --rw=randrw --blocksize=64k,8k --direct=1 --runtime=24000
>
> Tab2 runs the fio command below:
> sudo fio --name=random-writers2 --filename=/dev/bcache0
> --ioengine=libaio --iodepth=8 --rw=randwrite --bs=4k --rate_iops=150
> --direct=1 --write_lat_log=rw --log_avg_msec=20
>
Why do you limit the iodepth to 8 and the iops to 150 on the cache device?
For a cache device this limitation is quite small. 150 iops with a 4KB
block size means writing (150*4*60*60 = 2,160,000 KB ≈) 2GB per hour, so
35 hours is only about 70GB in total.
What if the iodepth is 128 or 64, with no iops rate limit?
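Something like the rough sketch below is what I mean; the device path and
the runtime are only copied from the commands above, and the exact iodepth
is an open choice:

  # sanity check of the rate math: 150 iops * 4 KB * 3600 s per hour
  echo $((150 * 4 * 3600)) KB/hour   # prints 2160000 KB/hour, ~2 GB/hour

  # the same write job as Tab2, but with a deeper queue and no --rate_iops cap
  sudo fio --name=random-writers2 --filename=/dev/bcache0 \
      --ioengine=libaio --iodepth=128 --rw=randwrite --bs=4k \
      --direct=1 --runtime=24000 --write_lat_log=rw --log_avg_msec=20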
> Note:
> - The Tab1 fio job runs for 24000 seconds; it is the one that causes the
> fragmentation and makes cache_available_percent drop to under 40.
> - The Tab2 fio job is the one whose latency I'm capturing; I let it run
> for about 35 hours, which is long enough for cache_available_percent to
> drop under 30.
> - This testing method uses fio with a larger read block size and a small
> write block size to cause the high fragmentation. In a real production
> environment there could be various reasons, or a combination of reasons,
> that cause the high fragmentation, but I believe any method that causes
> the fragmentation should be fine for verifying whether bcache with this
> patch responds better than the master in this situation.
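As a side note, below is a minimal sketch of how the cache_available_percent
drop described above can be watched while the jobs run, assuming the standard
bcache sysfs layout; <cset-uuid> is a placeholder for the real cache set UUID:

  # poll cache_available_percent once a minute; the test expects it to
  # fall below 40 and eventually below 30 as the fragmentation builds up
  while true; do
      echo "$(date +%s) $(cat /sys/fs/bcache/<cset-uuid>/cache_available_percent)"
      sleep 60
  done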
>
> Below are the testing results:
>
> The total run time is about 35 hours; the charts for each run contain
> about 1.5 million latency points.
>
> Master:
> [attachment: fio-lat-mater.png]
>
> Master + patch:
> [attachment: fio-lat-patch.png]
>
> Combined together:
> [attachment: fio-lat-mix.png]
>
> Now we can see the master is even worse when we increase the iodepth,
> which makes sense since the backing HDD is being stressed more heavily.
>
> Below are the cache stats changing during the run:
>
> Master:
> [attachment: bcache-stats-master.png]
>
> Master + the patch:
> [attachment: bcache-stats-patch.png]
>
> That's all the testing done with the 400GB NVMe with a 512B block size.
>
> Coly, do you want me to continue the same testing on the 1TB NVMe with a
> different block size, or is it ok to skip the 1TB testing and continue
> the test with the 400GB NVMe but with different block sizes?
> Feel free to let me know any other test scenarios that we should cover
> here.
Yes please, more testing is desired for a performance improvement. So far
I don't see performance numbers for a real high workload yet.
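For whatever workload ends up being used, it may also help to log the
writeback state over time, so that master and the patched kernel can be
compared directly. A minimal sketch, assuming the default sysfs path for
bcache0 and an arbitrary 30 second interval:

  # append a timestamped snapshot of the writeback rate debug info every 30s
  while true; do
      { date +%s; cat /sys/block/bcache0/bcache/writeback_rate_debug; } \
          >> writeback_rate.log
      sleep 30
  done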
Thanks.
Coly Li