lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 18 Dec 2006 10:03:45 -0800 (PST)
From:	Linus Torvalds <torvalds@...l.org>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
cc:	Andrew Morton <akpm@...l.org>, andrei.popa@...eo.ro,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Hugh Dickins <hugh@...itas.com>,
	Florian Weimer <fw@...eb.enyo.de>,
	Marc Haber <mh+linux-kernel@...schlus.de>,
	Martin Michlmayr <tbm@...ius.com>
Subject: Re: 2.6.19 file content corruption on ext3


Andrei,
 could you try Peter's patch (on top of Andrew's patch - it depends on 
it, and wouldn't work on an unmodified -git kernel, but add the WARN_ON() 
I mention in this email? You seem to be able to reproduce this easily.. 
Thanks)

On Mon, 18 Dec 2006, Peter Zijlstra wrote:
> 
> This should be safe; page_mkclean walks the rmap and flips the pte's
> under the pte lock and records the dirty state while iterating.
> Concurrent faults will either do set_page_dirty() before we get around
> to doing it or vice versa, but dirty state is not lost.

Ok, I really liked this patch, but the more I thought about it, the more I 
started to doubt the reasons for liking it.

I think we have some core fundamental problem here that this patch is 
needed at all.

So let's think about this: we apparently have two cases of 
"clear_page_dirty()":

 - the one that really wants to clear the bit unconditionally (Andrew 
   calls this the "must_clean_ptes" case, which I personally find to be a 
   really confusing name, but whatever)

 - the other case. The case that doesn't want to really clear the pte 
   dirty bits.

and I thought your patch made sense, because it saved away the pte state 
in the page dirty state, and that matches my mental model, but the more I 
think about it, the less sense that whole "the other case" situation makes 
AT ALL.

Why does "the other case" exist at all? If you want to clear the dirty 
page flag, what is _ever_ the reason for not wanting to drop PTE dirty 
information? In other words, what possible reason can there ever be for 
saying "I want this page to be clean", while at the same time saying "but 
if it was dirty in the page tables, don't forget about that state".

So I absolutely detested Andrew's original patch, because that one made 
zero sense at all even from a code standpoint. With your patch on top, it 
all suddenly makes sense: at least you don't just leave dirty pages in the 
PTE's with a "struct page" that is marked clean, and the end result is 
undeniably at least _consistent_.

So Andrew's patch I can't stand, because the whole point of it seems to be 
to leave the system in an inconsistent state (dirty in the pte's but 
marked "clean"), and if we want to have that state, then we should just 
revert _everything_ to the 2.6.18 situation, and not play these games at 
all.

Andrew's patch with your patch on top makes me happy, because now we're 
at least honoring all the basic rules (we don't get into an inconsistent 
state), so on a local level it all makes sense. HOWEVER, I then don't 
actually understand how it could ever actually make sense to ask for 
"please clean the page, but don't actually clean it".

So _I_ think that we should add a honking huge WARN_ON() for this case. 
Ie, do your patch, but instead of re-dirtying the page:

+                       if (!must_clean_ptes && cleaned)
+                               set_page_dirty(page);

we would do

+                       if (!must_clean_ptes && cleaned) {
+                               WARN_ON(1);
+                               set_page_dirty(page);
+                       }

and ask the people who see this problem to see if they get the WARN_ON() 
(assuming it _fixes_ their data corruption).

Because whoever calls "clean_dirty_page()" without actually wanting to 
clean the PTE's is really a bug: those dirty PTE's had better not exist.

Or maybe the WARN_ON() just points out _why_ somebody would want to do 
something this insane. Right now I just can't see why it's a valid thing 
to do.

Maybe I'm still confused. 

		Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ