[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20101214152633.GA26717@localhost>
Date: Tue, 14 Dec 2010 23:26:33 +0800
From: Wu Fengguang <fengguang.wu@...el.com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Richard Kennedy <richard@....demon.co.uk>,
Andrew Morton <akpm@...ux-foundation.org>,
Jan Kara <jack@...e.cz>, Christoph Hellwig <hch@....de>,
Trond Myklebust <Trond.Myklebust@...app.com>,
Dave Chinner <david@...morbit.com>,
Theodore Ts'o <tytso@....edu>,
Chris Mason <chris.mason@...cle.com>,
Mel Gorman <mel@....ul.ie>, Rik van Riel <riel@...hat.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Greg Thelen <gthelen@...gle.com>,
Minchan Kim <minchan.kim@...il.com>,
linux-mm <linux-mm@...ck.org>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 04/35] writeback: reduce per-bdi dirty threshold ramp
up time
On Tue, Dec 14, 2010 at 11:15:07PM +0800, Wu Fengguang wrote:
> On Tue, Dec 14, 2010 at 10:50:55PM +0800, Peter Zijlstra wrote:
> > On Tue, 2010-12-14 at 22:39 +0800, Wu Fengguang wrote:
> > > On Tue, Dec 14, 2010 at 10:33:25PM +0800, Wu Fengguang wrote:
> > > > On Tue, Dec 14, 2010 at 09:59:10PM +0800, Wu Fengguang wrote:
> > > > > On Tue, Dec 14, 2010 at 09:37:34PM +0800, Richard Kennedy wrote:
> > > >
> > > > > > As to the ramp up time, when writing to 2 disks at the same time I see
> > > > > > the per_bdi_threshold taking up to 20 seconds to converge on a steady
> > > > > > value after one of the write stops. So I think this could be speeded up
> > > > > > even more, at least on my setup.
> > > > >
> > > > > I have the roughly same ramp up time on the 1-disk 3GB mem test:
> > > > >
> > > > > http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/3G/ext4-1dd-1M-8p-2952M-2.6.37-rc5+-2010-12-09-00-37/dirty-pages.png
> > > > >
> > > >
> > > > Interestingly, the above graph shows that after about 10s fast ramp
> > > > up, there is another 20s slow ramp down. It's obviously due the
> > > > decline of global limit:
> > > >
> > > > http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/3G/ext4-1dd-1M-8p-2952M-2.6.37-rc5+-2010-12-09-00-37/vmstat-dirty.png
> > > >
> > > > But why is the global limit declining? The following log shows that
> > > > nr_file_pages keeps growing and goes stable after 75 seconds (so long
> > > > time!). In the same period nr_free_pages goes slowly down to its
> > > > stable value. Given that the global limit is mainly derived from
> > > > nr_free_pages+nr_file_pages (I disabled swap), something must be
> > > > slowly eating memory until 75 ms. Maybe the tracing ring buffers?
> > > >
> > > > free file reclaimable pages
> > > > 50s 369324 + 318760 => 688084
> > > > 60s 235989 + 448096 => 684085
> > > >
> > > > http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/3G/ext4-1dd-1M-8p-2952M-2.6.37-rc5+-2010-12-09-00-37/vmstat
> > >
> > > The log shows that ~64MB reclaimable memory is stoled. But the trace
> > > data only takes 1.8MB. Hmm..
> >
> > Also, trace buffers are fully pre-allocated.
> >
> > Inodes perhaps?
>
> Just figured out that it's the buffer heads :)
>
> The other interesting question is, why it takes up to 50s to consume
> all the nr_free_pages pages. I would imagine the free pages be quickly
> allocated to the page cache..
>
> Attached is the graph for ext2-1dd-1M-8p-2952M-2.6.37-rc5+-2010-12-09-01-36
Ah it's embarrassing.. we are writing data and the free memory
consumption is quickly bounded by the disk write speed..
So it's FS independent.
Here is the graph for ext3 on vanilla kernel, generated from
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/3G/ext3-1dd-1M-8p-2952M-2.6.37-rc5-2010-12-10-19-57/vmstat
And btrfs on vanilla kernel
http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/3G/btrfs-1dd-1M-8p-2952M-2.6.37-rc5-2010-12-10-21-23/vmstat
Thanks,
Fengguang
Download attachment "vmstat-reclaimable-500.png" of type "image/png" (68089 bytes)
Download attachment "vmstat-dirty-500.png" of type "image/png" (57116 bytes)
Powered by blists - more mailing lists