Message-ID: <20070912014917.GJ995458@sgi.com>
Date: Wed, 12 Sep 2007 11:49:17 +1000
From: David Chinner <dgc@....com>
To: Nick Piggin <nickpiggin@...oo.com.au>
Cc: Mel Gorman <mel@...net.ie>, Christoph Lameter <clameter@....com>,
andrea@...e.de, torvalds@...ux-foundation.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
Christoph Hellwig <hch@....de>,
William Lee Irwin III <wli@...omorphy.com>,
David Chinner <dgc@....com>,
Jens Axboe <jens.axboe@...cle.com>,
Badari Pulavarty <pbadari@...il.com>,
Maxim Levitsky <maximlevitsky@...il.com>,
Fengguang Wu <fengguang.wu@...il.com>,
swin wang <wangswin@...il.com>, totty.lu@...il.com,
hugh@...itas.com, joern@...ybastard.org
Subject: Re: [00/41] Large Blocksize Support V7 (adds memmap support)
On Tue, Sep 11, 2007 at 04:00:17PM +1000, Nick Piggin wrote:
> > > OTOH, I'm not sure how much buy-in there was from the filesystems guys.
> > > Particularly Christoph H and XFS (which is strange because they already
> > > do vmapping in places).
> >
> > I think they use vmapping because they have to, not because they want
> > to. They might be a lot happier with fsblock if it used contiguous pages
> > for large blocks whenever possible - I don't know for sure. The metadata
> > accessors they might be unhappy with because it's inconvenient but as
> > Christoph Hellwig pointed out at VM/FS, the filesystems who really care
> > will convert.
>
> Sure, they would rather not to. But there are also a lot of ways you can
> improve vmap more than what XFS does (or probably what darwin does)
> (more persistence for cached objects, and batched invalidates for example).
XFS already has persistence across the object lifetime (which can be many
tens of seconds for a frequently used buffer) and it also does batched
unmapping of objects.
> There are also a lot of trivial things you can do to make a lot of those
> accesses not require vmaps (and less trivial things, but even such things
> as binary searches over multiple pages should be quite possible with a bit
> of logic).
Yes, we already do many of these things (via xfs_buf_offset()), but
that is not good enough for something like a memcpy that spans multiple
pages in a large block (think btree block compaction, splits and recombines).
IOWs, we already play these vmap harm-minimisation games in the places
where we can, but still the overhead is high and something we'd prefer
to be able to avoid.
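
To illustrate the kind of logic a copy spanning multiple pages needs when
there is no contiguous mapping, here is a minimal userspace sketch in C. It
is not the XFS code: struct sketch_buf, sketch_buf_offset(),
sketch_buf_copy() and SKETCH_PAGE_SIZE are hypothetical names invented for
this example, and the 4096-byte page size is an assumption. The lookup
helper is only in the spirit of xfs_buf_offset(): the pointer it returns is
usable up to the end of that one page, so the copy has to be chunked at
every page boundary on both the source and the destination side.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, for illustration only */

/* Hypothetical stand-in for a multi-page buffer: the pages are tracked
 * individually and need not be contiguous in virtual memory. */
struct sketch_buf {
    unsigned char **pages;
    size_t          page_count;
};

/* Map a byte offset in the buffer to a pointer inside the page holding it.
 * The result is only valid up to the end of that page. */
static unsigned char *sketch_buf_offset(const struct sketch_buf *bp, size_t offset)
{
    assert(offset / SKETCH_PAGE_SIZE < bp->page_count);
    return bp->pages[offset / SKETCH_PAGE_SIZE] + (offset % SKETCH_PAGE_SIZE);
}

/* Copy len bytes from src_off to dst_off within the buffer without a
 * contiguous mapping. Each chunk stops at the nearest page boundary of
 * either side. The two ranges must not overlap (memcpy semantics).
 * Returns 0 on success, -1 if a range falls outside the buffer. */
static int sketch_buf_copy(struct sketch_buf *bp, size_t dst_off,
                           size_t src_off, size_t len)
{
    size_t total = bp->page_count * SKETCH_PAGE_SIZE;

    if (dst_off > total || src_off > total ||
        len > total - dst_off || len > total - src_off)
        return -1;

    while (len > 0) {
        size_t src_room = SKETCH_PAGE_SIZE - (src_off % SKETCH_PAGE_SIZE);
        size_t dst_room = SKETCH_PAGE_SIZE - (dst_off % SKETCH_PAGE_SIZE);
        size_t chunk = len;

        if (chunk > src_room)
            chunk = src_room;
        if (chunk > dst_room)
            chunk = dst_room;

        memcpy(sketch_buf_offset(bp, dst_off),
               sketch_buf_offset(bp, src_off), chunk);

        dst_off += chunk;
        src_off += chunk;
        len -= chunk;
    }
    return 0;
}
```

Even in this simplified form, every chunk costs two page lookups and a
separate memcpy call, and an overlapping move (as compaction within a block
would need) requires extra direction handling that the sketch leaves out.
With a contiguous block the whole operation would be a single memcpy or
memmove.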
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group