lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 29 Dec 2006 13:16:46 +0200
From:	Andrei Popa <andrei.popa@...eo.ro>
To:	Linus Torvalds <torvalds@...l.org>
Cc:	Segher Boessenkool <segher@...nel.crashing.org>,
	David Miller <davem@...emloft.net>, nickpiggin@...oo.com.au,
	kenneth.w.chen@...el.com, guichaz@...oo.fr, hugh@...itas.com,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	ranma@...edrich.de, gordonfarquharson@...il.com,
	Andrew Morton <akpm@...l.org>, a.p.zijlstra@...llo.nl,
	tbm@...ius.com, arjan@...radead.org
Subject: Re: Ok, explained.. (was Re: [PATCH] mm: fix page_mkclean_one)

On Fri, 2006-12-29 at 02:48 -0800, Linus Torvalds wrote:
> 
> On Fri, 29 Dec 2006, Linus Torvalds wrote:
> > 
> > Hmm? I'd love it if somebody else wrote the patch and tested it, because 
> > I'm getting sick and tired of this bug ;)
> 
> Who the hell am I kidding? I haven't been able to sleep right for the last 
> few days over this bug. It was really getting to me.
> 
> And putting on the thinking cap, there's actually a fairly simple an 
> nonintrusive patch. It still has a tiny tiny race (see the comment), but I 
> bet nobody can really hit it in real life anyway, and I know several ways 
> to fix it, so I'm not really _that_ worried about it.
> 
> The patch is mostly a comment. The "real" meat of it is actually just a 
> few lines.
> 
> Can anybody get corruption with this thing applied? It goes on top of 
> plain v2.6.20-rc2.

Tested with rtorrent and there is no corruption.


> 
> 		Linus
> 
> ----
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index b3a198c..ec01da1 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -862,17 +862,46 @@ int clear_page_dirty_for_io(struct page *page)
>  {
>  	struct address_space *mapping = page_mapping(page);
>  
> -	if (!mapping)
> -		return TestClearPageDirty(page);
> -
> -	if (TestClearPageDirty(page)) {
> -		if (mapping_cap_account_dirty(mapping)) {
> -			page_mkclean(page);
> +	if (mapping && mapping_cap_account_dirty(mapping)) {
> +		/*
> +		 * Yes, Virginia, this is indeed insane.
> +		 *
> +		 * We use this sequence to make sure that
> +		 *  (a) we account for dirty stats properly
> +		 *  (b) we tell the low-level filesystem to
> +		 *      mark the whole page dirty if it was
> +		 *      dirty in a pagetable. Only to then
> +		 *  (c) clean the page again and return 1 to
> +		 *      cause the writeback.
> +		 *
> +		 * This way we avoid all nasty races with the
> +		 * dirty bit in multiple places and clearing
> +		 * them concurrently from different threads.
> +		 *
> +		 * Note! Normally the "set_page_dirty(page)"
> +		 * has no effect on the actual dirty bit - since
> +		 * that will already usually be set. But we
> +		 * need the side effects, and it can help us
> +		 * avoid races.
> +		 *
> +		 * We basically use the page "master dirty bit"
> +		 * as a serialization point for all the different
> +		 * threds doing their things.
> +		 *
> +		 * FIXME! We still have a race here: if somebody
> +		 * adds the page back to the page tables in
> +		 * between the "page_mkclean()" and the "TestClearPageDirty()",
> +		 * we might have it mapped without the dirty bit set.
> +		 */
> +		if (page_mkclean(page))
> +			set_page_dirty(page);
> +		if (TestClearPageDirty(page)) {
>  			dec_zone_page_state(page, NR_FILE_DIRTY);
> +			return 1;
>  		}
> -		return 1;
> +		return 0;
>  	}
> -	return 0;
> +	return TestClearPageDirty(page);
>  }
>  EXPORT_SYMBOL(clear_page_dirty_for_io);
>  

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ