Message-ID: <35a038d8-fe6b-7954-f2d9-be74eb32dcdd@suse.de>
Date:   Tue, 15 Dec 2020 01:07:31 +0800
From:   Coly Li <colyli@...e.de>
To:     Dongdong Tao <dongdong.tao@...onical.com>
Cc:     Gavin Guo <gavin.guo@...onical.com>,
        Gerald Yang <gerald.yang@...onical.com>,
        Trent Lloyd <trent.lloyd@...onical.com>,
        Kent Overstreet <kent.overstreet@...il.com>,
        "open list:BCACHE (BLOCK LAYER CACHE)" <linux-bcache@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>,
        Dominique Poulain <dominique.poulain@...onical.com>,
        Dongsheng Yang <dongsheng.yang@...ystack.cn>
Subject: Re: [PATCH] bcache: consider the fragmentation when update the
 writeback rate

On 12/14/20 11:30 PM, Dongdong Tao wrote:
> Hi Coly and Dongsheng,
> 
> I've got the testing results and confirmed that they are reproducible
> by repeating the test many times.
> I ran fio to collect the write latency log, parsed it, and then
> generated the latency graphs below with a visualization tool.
> 

Hi Dongdong,

Thank you so much for the performance numbers!

[snipped]
> So, my code accelerates the writeback process when the dirty buckets
> exceed 50% (this threshold can be tuned). As we can see,
> cache_available_percent does increase once it hits 50, so we won't
> hit the writeback cutoff issue.
> 
> Below are the steps that I used to do the experiment:
> 1. make-bcache -B <hdd> -C <nvme> --writeback -- I set the nvme size
> to 1G so the issue can be reproduced faster
> 
> 2. sudo fio --name=random-writers --filename=/dev/bcache0
> --ioengine=libaio --iodepth=1 --rw=randrw --bs=16k --direct=1
> --rate_iops=90,10 --numjobs=1 --write_lat_log=16k
> 
> 3. For a 1G nvme, running for about 20 minutes is enough to get the data.

A 1GB cache and 20 minutes are quite limited for the performance
evaluation. Could you please do similar testing with a 1TB SSD and 1 hour
for each run of the benchmark?

> 
> Using randrw with rate_iops=90,10 is just one way to reproduce this
> easily. It can be reproduced as long as we can create a fragmented
> situation in which quite little dirty data consumes a lot of dirty
> buckets, thus killing the write performance.
> 

Yes, this is a good method to generate dirty data segments.
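
To make the scenario concrete, here is a minimal user-space sketch (the
names and numbers are illustrative only, not taken from your patch) of
how a small amount of dirty data can pin a large share of the buckets:

#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative only: how dirty the average dirty bucket is.  A low
 * value means a small amount of dirty data is spread across many
 * buckets, which is what eats cache_available_percent.
 */
static uint64_t avg_bucket_dirtiness(uint64_t dirty_sectors,
                                     uint64_t dirty_buckets,
                                     uint64_t bucket_sectors)
{
    if (!dirty_buckets)
        return 100;
    return dirty_sectors * 100 / (dirty_buckets * bucket_sectors);
}

int main(void)
{
    /* hypothetical 1G cache: 2048 buckets of 1024 sectors (512KB) */
    uint64_t bucket_sectors = 1024;
    uint64_t nbuckets = 2048;
    /* ~10% of the cache holds dirty data ... */
    uint64_t dirty_sectors = nbuckets * bucket_sectors / 10;
    /* ... but that data is spread over 55% of the buckets */
    uint64_t dirty_buckets = nbuckets * 55 / 100;

    printf("average dirty bucket is only %llu%% dirty\n",
           (unsigned long long)avg_bucket_dirtiness(dirty_sectors,
                                                    dirty_buckets,
                                                    bucket_sectors));
    return 0;
}

With numbers like these, the writeback cutoff is reached while only a
fraction of the cached data is actually dirty, which is exactly the
situation you describe.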

> This bug is becoming very critical nowadays, as Ceph is hitting it;
> Ceph mostly submits random small IO.
> Please let me know what you need in order to move forward in this
> direction; I'm sure this patch can also be improved.

The performance numbers are quite convincing, and the idea in your patch
is promising.
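
If I read the idea correctly, its core can be sketched roughly like this
(a simplified user-space sketch; the names and thresholds are
illustrative, not the ones in your kernel patch):

#include <stdint.h>

/*
 * Simplified sketch of the acceleration idea, not the real
 * __update_writeback_rate(): once the dirty buckets exceed a tunable
 * threshold, scale the writeback rate up, and scale it harder the
 * more fragmented the dirty data is.
 */
#define DIRTY_BUCKET_THRESHOLD 50 /* percent, tunable */

static uint64_t accelerate_writeback_rate(uint64_t rate,
                                          uint64_t dirty_bucket_percent,
                                          uint64_t avg_bucket_dirtiness)
{
    uint64_t boost;

    if (dirty_bucket_percent <= DIRTY_BUCKET_THRESHOLD)
        return rate; /* keep the normal PI-controlled rate */

    /* grow the boost as we approach the writeback cutoff ... */
    boost = dirty_bucket_percent - DIRTY_BUCKET_THRESHOLD;

    /* ... and grow it further when the dirty buckets are mostly empty */
    boost = boost * 100 / (avg_bucket_dirtiness + 1);

    return rate + rate * boost / 100;
}

How aggressively the scaling should ramp up is exactly the kind of
detail worth tuning against the larger test run.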

I will provide my comments on your patch after we see the performance
numbers for a larger cache device and a longer run time.

Thanks again for the detailed performance numbers, which are really
valuable for performance optimization changes :-)

Coly Li
