linux-kernel - Re: [PATCH] mm: fix page_mkclean_one (was: 2.6.19 file content corruption on ext3)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Sun, 24 Dec 2006 01:26:08 -0800 (PST)
From:	Linus Torvalds <torvalds@...l.org>
To:	Andrew Morton <akpm@...l.org>
cc:	Gordon Farquharson <gordonfarquharson@...il.com>,
	Martin Michlmayr <tbm@...ius.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Andrei Popa <andrei.popa@...eo.ro>,
	Hugh Dickins <hugh@...itas.com>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Arjan van de Ven <arjan@...radead.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm: fix page_mkclean_one (was: 2.6.19 file content
 corruption on ext3)

On Sun, 24 Dec 2006, Andrew Morton wrote:
> 
> > I now _suspect_ that we're talking about something like
> > 
> >  - we started a writeout. The IO is still pending, and the page was 
> >    marked clean and is now in the "writeback" phase.
> >  - a write happens to the page, and the page gets marked dirty again. 
> >    Marking the page dirty also marks all the _buffers_ in the page dirty, 
> >    but they were actually already dirty, because the IO hasn't completed 
> >    yet.
> >  - the IO from the _previous_ write completes, and marks the buffers clean 
> >    again.
> 
> Some things for the testers to try, please:
> 
> - mount the fs with ext2 with the no-buffer-head option.  That means either:

[ snip snip ]

This is definitely worth testing, but the exact schenario I outlined is 
probably not the thing that happens. It was really meant to be more of an 
exmple of the _kind_ of situation I think we might have.

That would explain why we didn't see this before: we simply didn't mark 
pages clean all that aggressively, and an app like rtorrent would normally 
have caused its flushes to happen _synchronously_ by using msync() (even 
if the IO itself was done asynchronously, all the dirty bit stuff would be 
synchronous wrt any rtorrent behaviour).

And the things that /did/ use to clean pages asynchronously (VM scanning) 
would always actually look at the "young" bit (aka "accessed") and not 
even touch the dirty bit if an application had accessed the page recently, 
so that basically avoided any likely races, because we'd touch the dirty 
bit ONLY if the page was "cold".

So this is why I'm saying that it might be an old bug, and it would be 
just the new pattern of handling dirty bits that triggers it.

But avoiding buffer heads and testing that part is worth doing. Just to 
remove one thing from the equation.

		Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/