Message-ID: <45950561.6040508@yahoo.com.au>
Date: Fri, 29 Dec 2006 23:09:05 +1100
From: Nick Piggin <nickpiggin@...oo.com.au>
To: Linus Torvalds <torvalds@...l.org>
CC: Segher Boessenkool <segher@...nel.crashing.org>,
David Miller <davem@...emloft.net>, kenneth.w.chen@...el.com,
guichaz@...oo.fr, hugh@...itas.com,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
ranma@...edrich.de, gordonfarquharson@...il.com,
Andrew Morton <akpm@...l.org>, a.p.zijlstra@...llo.nl,
tbm@...ius.com, arjan@...radead.org, andrei.popa@...eo.ro
Subject: Re: Ok, explained.. (was Re: [PATCH] mm: fix page_mkclean_one)
Hey, nice work Linus!
Linus Torvalds wrote:
>
> On Fri, 29 Dec 2006, Linus Torvalds wrote:
>
>>Hmm? I'd love it if somebody else wrote the patch and tested it, because
>>I'm getting sick and tired of this bug ;)
>
>
> Who the hell am I kidding? I haven't been able to sleep right for the last
> few days over this bug. It was really getting to me.
>
> And putting on the thinking cap, there's actually a fairly simple and
> nonintrusive patch.
Yeah, *this* makes more sense. And in retrospect it was simple: we
can't just throw out pte dirtiness information if the page doesn't
have all of its buffers dirtied.
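To spell that out, here is a toy userspace model of the two orderings. It is only a sketch, not kernel code: struct toy_page, TOY_NBUF and every toy_* helper are made up for illustration, and the model assumes that set_page_dirty() on a buffer-backed page dirties all of its buffers.

```c
#include <stdbool.h>

#define TOY_NBUF 4	/* buffers per page; arbitrary for this model */

/* Hypothetical stand-in for struct page plus its mapping state. */
struct toy_page {
	bool pg_dirty;			/* models PG_dirty */
	bool pte_dirty;			/* models a dirty pte mapping the page */
	bool buf_dirty[TOY_NBUF];	/* models per-buffer dirty state */
};

/* Models page_mkclean(): clean the pte, report whether it was dirty. */
static bool toy_page_mkclean(struct toy_page *p)
{
	bool was_dirty = p->pte_dirty;

	p->pte_dirty = false;
	return was_dirty;
}

/* Models set_page_dirty() (assumed): dirty every buffer and PG_dirty. */
static void toy_set_page_dirty(struct toy_page *p)
{
	for (int i = 0; i < TOY_NBUF; i++)
		p->buf_dirty[i] = true;
	p->pg_dirty = true;
}

/* Models TestClearPageDirty(): clear PG_dirty, return the old value. */
static bool toy_test_clear_page_dirty(struct toy_page *p)
{
	bool was_dirty = p->pg_dirty;

	p->pg_dirty = false;
	return was_dirty;
}

/* Old ordering: the pte dirty bit is dropped, buffers are untouched. */
static int toy_clear_dirty_old(struct toy_page *p)
{
	if (toy_test_clear_page_dirty(p)) {
		toy_page_mkclean(p);	/* return value ignored */
		return 1;
	}
	return 0;
}

/* Patched ordering: a dirty pte re-dirties the whole page first. */
static int toy_clear_dirty_new(struct toy_page *p)
{
	if (toy_page_mkclean(p))
		toy_set_page_dirty(p);
	return toy_test_clear_page_dirty(p) ? 1 : 0;
}

/* Number of buffers a later writeback pass would write out. */
static int toy_count_dirty_buffers(const struct toy_page *p)
{
	int n = 0;

	for (int i = 0; i < TOY_NBUF; i++)
		n += p->buf_dirty[i];
	return n;
}
```

Start from a page with PG_dirty set, a dirty pte and only one dirty buffer. In this model the old ordering leaves just that one buffer queued for writeback after the pte has been cleaned, while the patched ordering queues all of them.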
> It still has a tiny tiny race (see the comment), but I
> bet nobody can really hit it in real life anyway, and I know several ways
> to fix it, so I'm not really _that_ worried about it.
Well, the race isn't a data loss one, is it? Just a case where the
pte may be dirty but the page dirty state is not accounted for.

Can we fix it by just putting the page_mkclean back inside the
TestClearPageDirty check, and re-clearing PG_dirty after redoing
the set_page_dirty?
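Roughly the ordering I have in mind, again as an untested userspace toy rather than a patch. struct toy_page and the toy_* helpers are invented stand-ins for the real page flags and calls, dirty accounting is left out, and set_page_dirty() is assumed to dirty all buffers.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the page state involved. */
struct toy_page {
	bool pg_dirty;		/* models PG_dirty */
	bool pte_dirty;		/* models a dirty pte mapping the page */
	bool bufs_all_dirty;	/* set by the assumed set_page_dirty() side effect */
};

/* Models TestClearPageDirty(): clear PG_dirty, return the old value. */
static bool toy_test_clear_page_dirty(struct toy_page *p)
{
	bool was_dirty = p->pg_dirty;

	p->pg_dirty = false;
	return was_dirty;
}

/* Models page_mkclean(): clean the pte, report whether it was dirty. */
static bool toy_page_mkclean(struct toy_page *p)
{
	bool was_dirty = p->pte_dirty;

	p->pte_dirty = false;
	return was_dirty;
}

/* Models set_page_dirty() (assumed): dirty all buffers and PG_dirty. */
static void toy_set_page_dirty(struct toy_page *p)
{
	p->bufs_all_dirty = true;
	p->pg_dirty = true;
}

/*
 * Proposed ordering: PG_dirty is tested first, page_mkclean() runs only
 * inside that check, and PG_dirty is cleared a second time after
 * set_page_dirty() has done its side effects.
 */
static int toy_clear_dirty_proposed(struct toy_page *p)
{
	if (!toy_test_clear_page_dirty(p))
		return 0;
	if (toy_page_mkclean(p)) {
		toy_set_page_dirty(p);		/* wanted for the side effects */
		toy_test_clear_page_dirty(p);	/* re-clear PG_dirty */
	}
	return 1;
}
```

One difference from the posted patch shows up directly in the model: if PG_dirty is not set on entry, this variant returns 0 without looking at the pte at all, so a dirty pte is left in place for whoever sets the page dirty later.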
>
> The patch is mostly a comment. The "real" meat of it is actually just a
> few lines.
>
> Can anybody get corruption with this thing applied? It goes on top of
> plain v2.6.20-rc2.
>
> Linus
>
> ----
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index b3a198c..ec01da1 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -862,17 +862,46 @@ int clear_page_dirty_for_io(struct page *page)
>  {
>  	struct address_space *mapping = page_mapping(page);
>
> -	if (!mapping)
> -		return TestClearPageDirty(page);
> -
> -	if (TestClearPageDirty(page)) {
> -		if (mapping_cap_account_dirty(mapping)) {
> -			page_mkclean(page);
> +	if (mapping && mapping_cap_account_dirty(mapping)) {
> +		/*
> +		 * Yes, Virginia, this is indeed insane.
> +		 *
> +		 * We use this sequence to make sure that
> +		 *  (a) we account for dirty stats properly
> +		 *  (b) we tell the low-level filesystem to
> +		 *      mark the whole page dirty if it was
> +		 *      dirty in a pagetable. Only to then
> +		 *  (c) clean the page again and return 1 to
> +		 *      cause the writeback.
> +		 *
> +		 * This way we avoid all nasty races with the
> +		 * dirty bit in multiple places and clearing
> +		 * them concurrently from different threads.
> +		 *
> +		 * Note! Normally the "set_page_dirty(page)"
> +		 * has no effect on the actual dirty bit - since
> +		 * that will already usually be set. But we
> +		 * need the side effects, and it can help us
> +		 * avoid races.
> +		 *
> +		 * We basically use the page "master dirty bit"
> +		 * as a serialization point for all the different
> +		 * threads doing their things.
> +		 *
> +		 * FIXME! We still have a race here: if somebody
> +		 * adds the page back to the page tables in
> +		 * between the "page_mkclean()" and the "TestClearPageDirty()",
> +		 * we might have it mapped without the dirty bit set.
> +		 */
> +		if (page_mkclean(page))
> +			set_page_dirty(page);
> +		if (TestClearPageDirty(page)) {
>  			dec_zone_page_state(page, NR_FILE_DIRTY);
> +			return 1;
>  		}
> -		return 1;
> +		return 0;
>  	}
> -	return 0;
> +	return TestClearPageDirty(page);
>  }
>  EXPORT_SYMBOL(clear_page_dirty_for_io);
>
>
--
SUSE Labs, Novell Inc.