[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ce04461c-ca5c-781d-7aad-cdad3ebadac2@easystack.cn>
Date: Wed, 9 Dec 2020 15:49:26 +0800
From: Dongsheng Yang <dongsheng.yang@...ystack.cn>
To: Dongdong Tao <dongdong.tao@...onical.com>
Cc: Dongdong Tao <tdd21151186@...il.com>, Coly Li <colyli@...e.de>,
Gavin Guo <gavin.guo@...onical.com>,
Gerald Yang <gerald.yang@...onical.com>,
Trent Lloyd <trent.lloyd@...onical.com>,
Kent Overstreet <kent.overstreet@...il.com>,
"open list:BCACHE (BLOCK LAYER CACHE)" <linux-bcache@...r.kernel.org>,
open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] bcache: consider the fragmentation when update the
writeback rate
在 2020/12/9 星期三 下午 12:48, Dongdong Tao 写道:
> Hi Dongsheng,
>
> I'm working on it, next step I'm gathering some testing data and
> upload (very sorry for the delay...)
> Thanks for the comment.
> One of the main concerns to alleviate this issue with the writeback
> process is that we need to minimize the impact on the client IO
> performance.
> writeback_percent by default is 10, start writeback when dirty buckets
> reached 10 percent might be a bit too aggressive, as the
> writeback_cutoff_sync is 70 percent.
> So i chose to start the writeback when dirty buckets reached 50
> percent so that this patch will only take effect after dirty buckets
> percent is above that
Agree with that's too aggressive to reuse writeback_percent, and that's
less flexable.
Okey, let's wait for your testing result.
Thanx
>
> Thanks,
> Dongdong
>
>
>
>
> On Wed, Dec 9, 2020 at 10:27 AM Dongsheng Yang
> <dongsheng.yang@...ystack.cn> wrote:
>>
>> 在 2020/11/3 星期二 下午 8:42, Dongdong Tao 写道:
>>> From: dongdong tao <dongdong.tao@...onical.com>
>>>
>>> Current way to calculate the writeback rate only considered the
>>> dirty sectors, this usually works fine when the fragmentation
>>> is not high, but it will give us unreasonable small rate when
>>> we are under a situation that very few dirty sectors consumed
>>> a lot dirty buckets. In some case, the dirty bucekts can reached
>>> to CUTOFF_WRITEBACK_SYNC while the dirty data(sectors) noteven
>>> reached the writeback_percent, the writeback rate will still
>>> be the minimum value (4k), thus it will cause all the writes to be
>>> stucked in a non-writeback mode because of the slow writeback.
>>>
>>> This patch will try to accelerate the writeback rate when the
>>> fragmentation is high. It calculate the propotional_scaled value
>>> based on below:
>>> (dirty_sectors / writeback_rate_p_term_inverse) * fragment
>>> As we can see, the higher fragmentation will result a larger
>>> proportional_scaled value, thus cause a larger writeback rate.
>>> The fragment value is calculated based on below:
>>> (dirty_buckets * bucket_size) / dirty_sectors
>>> If you think about it, the value of fragment will be always
>>> inside [1, bucket_size].
>>>
>>> This patch only considers the fragmentation when the number of
>>> dirty_buckets reached to a dirty threshold(configurable by
>>> writeback_fragment_percent, default is 50), so bcache will
>>> remain the original behaviour before the dirty buckets reached
>>> the threshold.
>>>
>>> Signed-off-by: dongdong tao <dongdong.tao@...onical.com>
>>> ---
>>> drivers/md/bcache/bcache.h | 1 +
>>> drivers/md/bcache/sysfs.c | 6 ++++++
>>> drivers/md/bcache/writeback.c | 21 +++++++++++++++++++++
>>> 3 files changed, 28 insertions(+)
>>>
>>> diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
>>> index 1d57f48307e6..87632f7032b6 100644
>>> --- a/drivers/md/bcache/bcache.h
>>> +++ b/drivers/md/bcache/bcache.h
>>> @@ -374,6 +374,7 @@ struct cached_dev {
>>> unsigned int writeback_metadata:1;
>>> unsigned int writeback_running:1;
>>> unsigned char writeback_percent;
>>> + unsigned char writeback_fragment_percent;
>>> unsigned int writeback_delay;
>>>
>>> uint64_t writeback_rate_target;
>>> diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
>>> index 554e3afc9b68..69499113aef8 100644
>>> --- a/drivers/md/bcache/sysfs.c
>>> +++ b/drivers/md/bcache/sysfs.c
>>> @@ -115,6 +115,7 @@ rw_attribute(stop_when_cache_set_failed);
>>> rw_attribute(writeback_metadata);
>>> rw_attribute(writeback_running);
>>> rw_attribute(writeback_percent);
>>> +rw_attribute(writeback_fragment_percent);
>>
>> Hi Dongdong and Coly,
>>
>> What is the status about this patch? In my opinion, it is a problem
>> we need to solve,
>>
>> but can we just reuse the parameter of writeback_percent, rather than
>> introduce a new writeback_fragment_percent?
>>
>> That means the semantic of writeback_percent will act on dirty data
>> percent and dirty bucket percent.
>>
>> When we found there are dirty buckets more than (c->nbuckets *
>> writeback_percent), start the writeback.
>>
>>
>> Thanx
>>
>> Yang
>>
>>> rw_attribute(writeback_delay);
>>> rw_attribute(writeback_rate);
>>>
>>> @@ -197,6 +198,7 @@ SHOW(__bch_cached_dev)
>>> var_printf(writeback_running, "%i");
>>> var_print(writeback_delay);
>>> var_print(writeback_percent);
>>> + var_print(writeback_fragment_percent);
>>> sysfs_hprint(writeback_rate,
>>> wb ? atomic_long_read(&dc->writeback_rate.rate) << 9 : 0);
>>> sysfs_printf(io_errors, "%i", atomic_read(&dc->io_errors));
>>> @@ -308,6 +310,9 @@ STORE(__cached_dev)
>>> sysfs_strtoul_clamp(writeback_percent, dc->writeback_percent,
>>> 0, bch_cutoff_writeback);
>>>
>>> + sysfs_strtoul_clamp(writeback_fragment_percent, dc->writeback_fragment_percent,
>>> + 0, bch_cutoff_writeback_sync);
>>> +
>>> if (attr == &sysfs_writeback_rate) {
>>> ssize_t ret;
>>> long int v = atomic_long_read(&dc->writeback_rate.rate);
>>> @@ -498,6 +503,7 @@ static struct attribute *bch_cached_dev_files[] = {
>>> &sysfs_writeback_running,
>>> &sysfs_writeback_delay,
>>> &sysfs_writeback_percent,
>>> + &sysfs_writeback_fragment_percent,
>>> &sysfs_writeback_rate,
>>> &sysfs_writeback_rate_update_seconds,
>>> &sysfs_writeback_rate_i_term_inverse,
>>> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
>>> index 3c74996978da..34babc89fdf3 100644
>>> --- a/drivers/md/bcache/writeback.c
>>> +++ b/drivers/md/bcache/writeback.c
>>> @@ -88,6 +88,26 @@ static void __update_writeback_rate(struct cached_dev *dc)
>>> int64_t integral_scaled;
>>> uint32_t new_rate;
>>>
>>> + /*
>>> + * We need to consider the number of dirty buckets as well
>>> + * when calculating the proportional_scaled, Otherwise we might
>>> + * have an unreasonable small writeback rate at a highly fragmented situation
>>> + * when very few dirty sectors consumed a lot dirty buckets, the
>>> + * worst case is when dirty_data reached writeback_percent and
>>> + * dirty buckets reached to cutoff_writeback_sync, but the rate
>>> + * still will be at the minimum value, which will cause the write
>>> + * stuck at a non-writeback mode.
>>> + */
>>> + struct cache_set *c = dc->disk.c;
>>> +
>>> + if (c->gc_stats.in_use > dc->writeback_fragment_percent && dirty > 0) {
>>> + int64_t dirty_buckets = (c->gc_stats.in_use * c->nbuckets) / 100;
>>> + int64_t fragment = (dirty_buckets * c->cache->sb.bucket_size) / dirty;
>>> +
>>> + proportional_scaled =
>>> + div_s64(dirty, dc->writeback_rate_p_term_inverse) * (fragment);
>>> + }
>>> +
>>> if ((error < 0 && dc->writeback_rate_integral > 0) ||
>>> (error > 0 && time_before64(local_clock(),
>>> dc->writeback_rate.next + NSEC_PER_MSEC))) {
>>> @@ -969,6 +989,7 @@ void bch_cached_dev_writeback_init(struct cached_dev *dc)
>>> dc->writeback_metadata = true;
>>> dc->writeback_running = false;
>>> dc->writeback_percent = 10;
>>> + dc->writeback_fragment_percent = 50;
>>> dc->writeback_delay = 30;
>>> atomic_long_set(&dc->writeback_rate.rate, 1024);
>>> dc->writeback_rate_minimum = 8;
Powered by blists - more mailing lists