lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1166605296.10372.191.camel@twins>
Date:	Wed, 20 Dec 2006 10:01:36 +0100
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Linus Torvalds <torvalds@...l.org>
Cc:	Andrei Popa <andrei.popa@...eo.ro>, Andrew Morton <akpm@...l.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Hugh Dickins <hugh@...itas.com>,
	Florian Weimer <fw@...eb.enyo.de>,
	Marc Haber <mh+linux-kernel@...schlus.de>,
	Martin Michlmayr <tbm@...ius.com>,
	Martin Schwidefsky <schwidefsky@...ibm.com>,
	Heiko Carstens <heiko.carstens@...ibm.com>
Subject: Re: 2.6.19 file content corruption on ext3

On Tue, 2006-12-19 at 16:23 -0800, Linus Torvalds wrote:
> 
> On Wed, 20 Dec 2006, Peter Zijlstra wrote:
> > On Mon, 2006-12-18 at 12:14 -0800, Linus Torvalds wrote:
> > > OR:
> > > 
> > >  - page_mkclean_one() is simply buggy.
> > 
> > GOLD!
> 
> Ok. I was looking at that, and I wondered..
> 
> However, if that works, then I _think_ the correct sequence is the 
> following..
> 
> The rule should be:
>  - we flush the tlb _after_ we have cleared it, but _before_ we insert the 
>    new entry.
> 
> But I dunno. These things are damn subtle. Does this patch fix it for you?

I will try, but I had a look around the different architectures
implementation of ptep_clear_flush_dirty() and saw that not all do the
actual flush. So if we go down this road perhaps we should introduce
another per arch function that does the potential flush. like
flush_tlb_on_clear_dirty() or something like that.

Then we could write:

  entry = ptep_get_and_clear(mm, address, ptep)
  flush_tlb_on_clear_dirty(vma, address);
  entry = pte_mkclean(entry);
  entry = pte_wrprotect(entry);
  set_pte_at(mm, address, ptep, entry);

> I actually suspect we should do this as an arch-specific macro, and 
> totally replace the current "ptep_clear_flush_dirty()" with one that does 
> "ptep_clear_flush_dirty_and_set_wp()".
> 
> Because what I'd _really_ prefer to do on x86 (and probably on most other 
> sane architectures) is to do
> 
>  - atomically replace the pte with the EXACT SAME ONE, but one that 
>    has the writable bit clear.
> 
> 	bit_clear(_PAGE_BIT_RW, &(ptep)->pte_low);
> 
>  - flush the TLB, making sure that all CPU's will no longer write to it:
> 
> 	flush_tlb_page(vma, address);
> 
>  - finally, just fetch-and-clear the dirty bit (and since it's no longer 
>    writable, nobody should be settign it any more)
> 
> 	ret = bit_clear(__PAGE_BIT_DIRTY, &(ptep)->pte_low);
> 
> and now we should be all done.

Hmm, should we not flush after clearing the dirty bit? That is, why does
ptep_clear_flush_dirty() need a flush after clearing that bit? does it
leak through in the tlb copy?

Also, what is this page_test_and_clear_dirty() business, that seems to
be exclusively s390 btw. However they do seem to need this.

> But the "ptep_get_and_clear() + flush_tlb_page()" sequence should 
> hopefully also work.

Yeah, probably, not optimally so on some archs that don't actually need
the flush though. And as above, I wonder about s390.

(added our s390 friends to the CC list)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ