linux-kernel - Re: [PATCH] writeback: fix writeback cache thrashing

Message-ID: <1357378914.8716.3.camel@kernel.cn.ibm.com>
Date:	Sat, 05 Jan 2013 03:41:54 -0600
From:	Simon Jeons <simon.jeons@...il.com>
To:	Fengguang Wu <fengguang.wu@...el.com>
Cc:	Namjae Jeon <linkinjeon@...il.com>, Jan Kara <jack@...e.cz>,
	Wanpeng Li <liwanp@...ux.vnet.ibm.com>,
	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org,
	Namjae Jeon <namjae.jeon@...sung.com>,
	Vivek Trivedi <t.vivek@...sung.com>,
	Dave Chinner <dchinner@...hat.com>
Subject: Re: [PATCH] writeback: fix writeback cache thrashing

On Sat, 2013-01-05 at 15:38 +0800, Fengguang Wu wrote:
> On Fri, Jan 04, 2013 at 11:26:43PM -0600, Simon Jeons wrote:
> > On Sat, 2013-01-05 at 11:26 +0800, Fengguang Wu wrote:
> > > > > > Hi Namjae,
> > > > > >
> > > > > > Why use bdi_stat_error here? What's the meaning of its comment "maximal
> > > > > > error of a stat counter"?
> > > > > Hi Simon,
> > > > > 
> > > > > > As you know, bdi stats (BDI_RECLAIMABLE, BDI_WRITEBACK …) are kept in
> > > > > > percpu counters.
> > > > > > When these percpu counters are incremented/decremented simultaneously
> > > > > > on multiple CPUs by small amounts (each individual CPU counter staying
> > > > > > below the threshold BDI_STAT_BATCH),
> > > > > > it is possible that we get an approximate value (not the exact value)
> > > > > > of these percpu counters.
> > > > > > To handle this percpu counter error, we have used bdi_stat_error.
> > > > > > bdi_stat_error is the maximum error which can happen in percpu bdi
> > > > > > stats accounting.
> > > > > 
> > > > > bdi_stat(bdi, BDI_RECLAIMABLE);
> > > > > >  -> This gives an approximate value of BDI_RECLAIMABLE by reading the
> > > > > > previously stored value of the percpu count.
> > > > > > 
> > > > > > bdi_stat_sum(bdi, BDI_RECLAIMABLE);
> > > > > >  -> This gives the exact value of BDI_RECLAIMABLE. It takes a lock
> > > > > > and adds up the current percpu counts of the individual CPUs.
> > > > > >    It is not recommended to use it frequently as it is expensive. We
> > > > > > are better off using "bdi_stat" and working with the approximate value
> > > > > > of the bdi stats.
> > > > > 
> > > > 
> > > > Hi Namjae, thanks for the clarification.
> > > > 
> > > > But why compare the stat error count to bdi_bground_thresh? What's the
> > > 
> > > It's not comparing bdi_stat_error to bdi_bground_thresh, but rather,
> > > in concept, comparing bdi_stat (with error bound adjustments) to
> > > bdi_bground_thresh.
> > > 
> > > > relationship between them? I also see bdi_stat_error compared to
> > > > bdi_thresh/bdi_dirty in the function balance_dirty_pages.
> > > 
> > 
> > Hi Fengguang,
> > 
> > > Here, it's trying to use bdi_stat_sum(), the accurate (however more
> > > costly) version of bdi_stat(), if the error would possibly be large:
> > 
> > Why use bdi_stat_sum when the error is large and bdi_stat when the error is small?
> 

Thanks for your response Fengguang! :)

> It's the opposite. Please check this per-cpu counter routine to get an idea:
> 
> /*
>  * Add up all the per-cpu counts, return the result.  This is a more accurate
>  * but much slower version of percpu_counter_read_positive()
>  */                                                 
> s64 __percpu_counter_sum(struct percpu_counter *fbc)
> 
> > > 
> > >                 if (bdi_thresh < 2 * bdi_stat_error(bdi)) {
> > >                         bdi_reclaimable = bdi_stat_sum(bdi, BDI_RECLAIMABLE);
> > >                         //...
> > >                 } else {
> > >                         bdi_reclaimable = bdi_stat(bdi, BDI_RECLAIMABLE);
> > >                         //...
> > >                 }
> > > 

The comment above this code says:

                 * In order to avoid the stacked BDI deadlock we need
                 * to ensure we accurately count the 'dirty' pages when
                 * the threshold is low.

Why do you say that a low threshold means the error is large?


> > > Here the comment should have explained it well:
> > > 
> > >                  * In theory 1 page is enough to keep the consumer-producer
> > >                  * pipe going: the flusher cleans 1 page => the task dirties 1
> > >                  * more page. However bdi_dirty has accounting errors.  So use
> > 
> > Why does bdi_dirty have accounting errors?
> 
> Because it typically uses bdi_stat() to get the rough sum of the per-cpu
> counters.
>  
> Thanks,
> Fengguang
> 
> > >                  * the larger and more IO friendly bdi_stat_error.
> > >                  */
> > >                 if (bdi_dirty <= bdi_stat_error(bdi))
> > >                         break;
> > > 
> > > 
> > > Thanks,
> > > Fengguang
> > 
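
For readers following the thread, the relationship between bdi_stat(),
bdi_stat_sum() and bdi_stat_error() discussed above can be sketched in
userspace C. This is only a model under stated assumptions, not the kernel
code: every model_* name, MODEL_NR_CPUS and MODEL_BATCH are hypothetical,
there is no locking or real per-CPU storage, and the error bound is assumed
to be "number of CPUs times batch size". The point it tries to illustrate is
that the error bound is a fixed absolute number, so it only matters when the
threshold being compared against is of similar size.

```c
#include <assert.h>

#define MODEL_NR_CPUS 4   /* hypothetical CPU count */
#define MODEL_BATCH   8   /* hypothetical per-CPU batch size */

struct model_counter {
    long global;                 /* shared count, cheap to read */
    long local[MODEL_NR_CPUS];   /* per-CPU deltas not yet folded in */
};

/*
 * Add to one CPU's local delta. The delta is folded into the shared
 * count only once its magnitude reaches the batch size.
 */
static void model_add(struct model_counter *c, int cpu, long amount)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

    long v = c->local[cpu] + amount;
    if (v >= MODEL_BATCH || v <= -MODEL_BATCH) {
        c->global += v;
        c->local[cpu] = 0;
    } else {
        c->local[cpu] = v;
    }
}

/* Cheap, approximate read: ignores the unfolded per-CPU deltas. */
static long model_read(const struct model_counter *c)
{
    return c->global;
}

/* Exact but slower read: walks every per-CPU delta. */
static long model_sum(const struct model_counter *c)
{
    long sum = c->global;
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        sum += c->local[cpu];
    return sum;
}

/* Assumed worst-case gap between model_read() and model_sum(). */
static long model_error(void)
{
    return (long)MODEL_NR_CPUS * MODEL_BATCH;
}

/*
 * Same shape as the quoted check: when the threshold is small compared
 * with the error bound, pay for the exact sum; otherwise the cheap
 * approximate read is good enough.
 */
static long model_pick(const struct model_counter *c, long thresh)
{
    if (thresh < 2 * model_error())
        return model_sum(c);
    return model_read(c);
}
```

With these constants, calling model_add(&c, cpu, MODEL_BATCH - 1) once on
each CPU of a zeroed counter leaves model_read() at 0 while model_sum()
returns 28, against an assumed bound of 32. A threshold of 40 would then be
compared against the exact value, and a threshold of 1000 against the
approximate one, where an error of at most 32 is comparatively small.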

