[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <45861E68.3060403@yahoo.com.au>
Date: Mon, 18 Dec 2006 15:51:52 +1100
From: Nick Piggin <nickpiggin@...oo.com.au>
To: Linus Torvalds <torvalds@...l.org>
CC: Andrew Morton <akpm@...l.org>, andrei.popa@...eo.ro,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Hugh Dickins <hugh@...itas.com>,
Florian Weimer <fw@...eb.enyo.de>,
Marc Haber <mh+linux-kernel@...schlus.de>,
Martin Michlmayr <tbm@...ius.com>
Subject: Re: 2.6.19 file content corruption on ext3
Linus Torvalds wrote:
> [ Replying to myself - a sure sign that I don't get out enough ]
>
> On Sun, 17 Dec 2006, Linus Torvalds wrote:
>
>>So I don't actually see any serialization at all that would keep a random
>>page from being paged back in.
>
>
> We do actually serialize, but we do it _after_ the page has already been
> mapped back. Ie we do it for the dirty page case at rthe end of
> do_wp_page() and do_no_page() when we do the "set_page_dirty_balance()",
> but that's potentially too late - we've already mapped the page read-write
> into the address space.
I can't see how that's exactly a problem -- so long as the page does not
get reclaimed (it won't, because we have a ref on it) then all that matters
is that the page eventually gets marked dirty.
> That said, this means that only threaded apps should ever trigger any
> problems, which would seem to make it unlikely that this is the issue.
>
> But Andrew: I don't think it's necessarily true that
> "try_to_free_buffers()" callers have all unmapped the page.
>
> That seems to be true for vmscan.c (ie the shrink_page_list ->
> try_to_release_page -> try_to_release_buffers callchain), but what about
> the other callchains (through filesystems, or through "pagevec_strip()" or
> similar?) That pagevec_strip() is called from shrink_active_list(), I
> don't see that unmapping the pages..
Right. But would it really matter whether they are currently mapped or
not, given that we agree it may become mapped at any point?
I think the problem Andrew identified is real.
The issue is the disconnect between the pte dirtiness and a filesystem
bringing buffers clean. But I disagree with his fix, because we don't
actually want to just throw out that pte dirtiness information: we're
just trying to get the PG_dirty bit into synch with what the buffers are
telling us, not actually clean or dirty anything, as such.
Can we clear the page dirty bit, then run set_page_dirty afterwards, if
any dirty ptes are found?
The other thing we might be able to do is to skip doing the
clear_page_dirty if the page is uptodate. This feels more hackish but it
might be faster?
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists