Message-ID: <alpine.LSU.2.00.1011180941450.3210@tigran.mtv.corp.google.com>
Date: Thu, 18 Nov 2010 10:00:13 -0800 (PST)
From: Hugh Dickins <hughd@...gle.com>
To: Christoph Hellwig <hch@...radead.org>
cc: Theodore Tso <tytso@....edu>, Nick Piggin <npiggin@...nel.dk>,
Peter Zijlstra <peterz@...radead.org>,
Michel Lespinasse <walken@...gle.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Rik van Riel <riel@...hat.com>,
Kosaki Motohiro <kosaki.motohiro@...fujitsu.com>,
Theodore Tso <tytso@...gle.com>,
Michael Rubin <mrubin@...gle.com>,
Suleiman Souhlal <suleiman@...gle.com>
Subject: Re: [PATCH 3/3] mlock: avoid dirtying pages and triggering writeback
On Thu, 18 Nov 2010, Christoph Hellwig wrote:
> On Thu, Nov 18, 2010 at 05:43:06AM -0500, Theodore Tso wrote:
> > Why is it at all important that mlock() force block allocation for sparse blocks? It's not at all specified in the mlock() API definition that it does that.
> >
> > Are there really programs that assume that mlock() == fallocate()?!?
>
> If there are programs that do they can't predate linux 2.6.15, and only
> work on btrfs/ext4/xfs/etc, but not ext2/ext3/reiserfs. Seems rather
> unlikely to me.

Yes, almost. I'm very much on this side: mlocking should not dirty all
those pages. But I had better admit one argument for the opposition.
It's possible that we'd find a case somewhere which has always (i.e. even
pre-page_mkwrite) relied upon mlock of an entirely sparse file to result
in a nicely ordered allocation of blocks to the file (as would often have
happened with pdflush, I think), giving good sequential read patterns
ever after. With this patch, such a file would instead get much more
random block ordering, according to where the real writes actually fall.

It would be possible for a filesystem's ->fault(vma, &vmf) to observe
that it's being called on a VM_LOCKED|VM_SHARED vma, and make sure that
the page has backing in that case, to reproduce the old allocation behaviour
without all the unnecessary writing. But that would be extra work in every
filesystem that cares.
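
As a rough illustration of the check described above, here is a small
userspace model of it. This is not kernel code: the model_* names, the
two structs, and the flag values are all hypothetical stand-ins chosen
for the sketch, and the real block allocation is reduced to setting a
boolean. It only shows the shape of the idea, which is to give a hole
its backing on a VM_LOCKED|VM_SHARED fault while leaving the page clean.

```c
/*
 * Userspace model only, NOT kernel code. Every model_* name and both
 * flag values are assumptions made for illustration.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_VM_SHARED 0x00000008UL  /* illustrative value */
#define MODEL_VM_LOCKED 0x00002000UL  /* illustrative value */

/* Stand-in for the vma handed to a fault handler. */
struct model_vma {
    unsigned long vm_flags;
};

/* Stand-in for the state of one page of a file. */
struct model_page {
    bool has_backing;  /* blocks allocated for this page? */
    bool dirty;        /* marked dirty, so writeback would follow? */
};

/* True only when the mapping is both mlocked and shared. */
static bool model_wants_backing(const struct model_vma *vma)
{
    const unsigned long mask = MODEL_VM_LOCKED | MODEL_VM_SHARED;

    assert(vma != NULL);
    return (vma->vm_flags & mask) == mask;
}

/*
 * Hypothetical fault path: allocate backing for a hole when the vma is
 * locked and shared, without ever touching the dirty state.
 */
static void model_fault(const struct model_vma *vma, struct model_page *page)
{
    assert(page != NULL);
    if (model_wants_backing(vma) && !page->has_backing)
        page->has_backing = true;  /* stands in for fs block allocation */
    /* page->dirty is deliberately left alone */
}
```

In this model a fault on a locked, shared mapping ends with has_backing
set and dirty still clear, while a locked but private mapping is left
untouched. A real filesystem would have to do its own allocation at the
marked line, which is the extra per-filesystem work mentioned above.
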
Hugh