Message-ID: <20080519054554.GY103491721@sgi.com>
Date: Mon, 19 May 2008 15:45:54 +1000
From: David Chinner <dgc@....com>
To: Christoph Lameter <clameter@....com>
Cc: David Chinner <dgc@....com>, akpm@...ux-foundation.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
Mel Gorman <mel@...net.ie>, andi@...stfloor.org,
Rik van Riel <riel@...hat.com>,
Pekka Enberg <penberg@...helsinki.fi>, mpm@...enic.com
Subject: Re: [patch 10/21] buffer heads: Support slab defrag

On Fri, May 16, 2008 at 10:01:38AM -0700, Christoph Lameter wrote:
> On Fri, 16 May 2008, David Chinner wrote:
>
> > On Thu, May 15, 2008 at 10:42:15AM -0700, Christoph Lameter wrote:
> > > On Mon, 12 May 2008, David Chinner wrote:
> > >
> > > > If you are going to clean bufferheads (or pages), please clean entire
> > > > mappings via ->writepages as it leads to far superior I/O patterns
> > > > and a far higher aggregate rate of page cleaning.....
> > >
> > > That brings up another issue: Let's say I use writepages on a large file
> > > (couple of gig). How much do you want to write back?
> >
> > We're out of memory. I'd suggest writing back as much as you can
> > without blocking. e.g. treat it like pdflush and say 1024 pages, or
> > like balance_dirty_pages() and write a 'write_chunk' back from the
> > mapping (i.e. sync_writeback_pages()).
>
> Why are we out of memory?

Defragmentation is triggered as part of the usual memory reclaim
process, which implies we've run out of free memory, correct?

> How do you trigger such a special writeout?

filemap_fdatawrite_range() perhaps?

> > Any of these are better from an I/O perspective than single page
> > writeback....
>
> But then the filesystem can do tricks like writing out the surrounding areas
> as needed. The filesystem likely can estimate better how much writeout
> makes sense.

Pushing write-around into a method that is only supposed to write
the single page that is passed to it is a pretty bad abuse of the
API, especially as we have many simple, ranged writeback methods
you could call: filemap_fdatawrite_range(), do_writepages(),
->writepages, etc.
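
To put rough numbers on the difference, here is a toy user-space model,
not kernel code. It assumes a mapping can be described by one dirty flag
per page, that single-page writeback issues one I/O per dirty page, and
that a ranged method merges contiguous dirty pages into one I/O up to a
chunk limit. The function names and the max_chunk parameter are made up
for illustration; none of them are kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: dirty[i] is true when page i of the mapping is dirty. */

/* Single-page writeback: every dirty page becomes its own I/O request. */
size_t count_ios_single_page(const bool *dirty, size_t npages)
{
    size_t ios = 0;

    for (size_t i = 0; i < npages; i++)
        if (dirty[i])
            ios++;
    return ios;
}

/*
 * Ranged writeback: contiguous dirty pages are merged into one request,
 * capped at max_chunk pages per request (hypothetical limit, e.g. 1024).
 */
size_t count_ios_ranged(const bool *dirty, size_t npages, size_t max_chunk)
{
    size_t ios = 0;
    size_t run = 0; /* pages already placed in the current request */

    assert(max_chunk > 0);

    for (size_t i = 0; i < npages; i++) {
        if (!dirty[i]) {
            run = 0; /* a clean page ends the current request */
            continue;
        }
        if (run == 0 || run == max_chunk) {
            ios++;   /* start a new request */
            run = 0;
        }
        run++;
    }
    return ios;
}
```

For an 8-page mapping with dirty pattern 1,1,1,0,1,1,0,1 this model
counts 6 I/Os for single-page writeback and 3 for ranged writeback with
a 1024-page chunk. It ignores allocation, locking and elevator merging,
so treat it as an illustration of the argument rather than a measurement.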

FWIW, look at the mess of layering violations that write clustering
causes in XFS because we have to do this to keep allocation overhead
and fragmentation down to a minimum. It's a nasty hack to mitigate
the impact of the awful I/O patterns we see from the VM - suggesting
that all filesystems do this just so you don't have to call a
slightly smarter writeback primitive is insane....

Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group