lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 19 Dec 2006 16:03:49 -0800 (PST)
From:	Linus Torvalds <torvalds@...l.org>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
cc:	Andrew Morton <akpm@...l.org>,
	Nick Piggin <nickpiggin@...oo.com.au>, andrei.popa@...eo.ro,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Hugh Dickins <hugh@...itas.com>,
	Florian Weimer <fw@...eb.enyo.de>,
	Marc Haber <mh+linux-kernel@...schlus.de>,
	Martin Michlmayr <tbm@...ius.com>
Subject: Re: 2.6.19 file content corruption on ext3



On Wed, 20 Dec 2006, Peter Zijlstra wrote:

> On Tue, 2006-12-19 at 14:58 -0800, Andrew Morton wrote:
> 
> > Well... we'd need to see (corruption && this-not-triggering) to be sure.
> > 
> > Peter, have you been able to trigger the corruption?
> 
> Yes; however the mail I send describing that seems to be lost in space.

Btw, can somebody actually explain the mess that is ext3 "dirtying".

Ext3 does NOT use __set_page_dirty_buffers. It does

	static int ext3_journalled_set_page_dirty(struct page *page)
	{
	        SetPageChecked(page);
	        return __set_page_dirty_nobuffers(page);
	}

and uses that "Checked" bit as a "whole page is dirty" bit (which it tests 
in "writepage()".

You realize what this all means? It means that ANYTHING that actually 
clears the _real_ dirty bit won't actually be doing anything at all for 
ext3, because the Checked bit will still stay set, and any IO down the 
line on that page would totally ignore the dirty bits on the buffer heads 
and just write out everything.

That is "The Mess(tm)".

It also basically means that anything that clears the dirty bit without 
just calling "writepage()" had _better_ call "invalidatepage()" for the 
whole page, because otherwise the PageChecked bit will never be cleared as 
far as I can see. Happily, at least ext3 seems to _test_ for that case in 
the release_page() function, so it appears that we do do this.

But this seems to just strengthen my argument: you can NEVER clean a page, 
unless you (a) do IO on it immediately afterwards (writeback) or (b) 
invalidate it entirely (truncate).

I'd really like to see just those two functions exist. Preferably in a 
form where you can see easily that we actually follow those rules. Rather 
than having a confusing set of "clear_page_dirty()" and
"test_and_clear_page_dirty()" functions that are called from random 
places.

IOW, I think the "clear_page_dirty_for_io()" is fine (it's case (a)) 
above, and then we should probably have a "cancel_dirty_page()" function 
that does all the current clear_page_dirty() but also makes sure that we 
actually call the invalidate_page() function itself. 

Hmm?

			Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ