lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1271083207.4807.18.camel@twins>
Date:	Mon, 12 Apr 2010 16:40:07 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Borislav Petkov <bp@...en8.de>,
	Johannes Weiner <hannes@...xchg.org>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Rik van Riel <riel@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Minchan Kim <minchan.kim@...il.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Lee Schermerhorn <Lee.Schermerhorn@...com>,
	Nick Piggin <npiggin@...e.de>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Hugh Dickins <hugh.dickins@...cali.co.uk>,
	sgunderson@...foot.com
Subject: Re: [PATCH -v2] rmap: make anon_vma_prepare link in all the
 anon_vmas of a mergeable VMA

On Sat, 2010-04-10 at 11:21 -0700, Linus Torvalds wrote:
> 

> Ho humm.
> 
> Maybe I'm crazy, but something started bothering me. And I started 
> wondering: when is the 'page->mapping' of an anonymous page actually 
> cleared?
> 
> The thing is, the mapping of an anonymous page is actually cleared only 
> when the page is _freed_, in "free_hot_cold_page()". 
> 
> Now, let's think about that. And in particular, let's think about how that 
> relates to the freeing of the 'anon_vma' that the page->mapping points to. 
> 
> The way the anon_vma is freed is when the mapping is torn down, and we do 
> roughly:
> 
> 	tlb = tlb_gather_mmu(mm,..)
> 	..
> 	unmap_vmas(&tlb, vma ..
> 	..
> 	free_pgtables()
> 	..
> 	tlb_finish_mmu(tlb, start, end);
> 
> and we actually unmap all the pages in "unmap_vmas()", and then _after_ 
> unmapping all the pages we do the "unlink_anon_vmas(vma);" in 
> "free_pgtables()". Fine so far - the anon_vma stay around until after the 
> page has been happily unmapped.
> 
> But "unmapped all the pages" is _not_ actually the same as "free'd all the 
> pages". The actual _freeing_ of the page happens generally in 
> tlb_finish_mmu(), because we can free the page only after we've flushed 
> any TLB entries.
> 
> So what we have in that tlb_gather structure is a list of _pending_ pages 
> to be freed, while we already actually free'd the anon_vmas earlier!
> 
> Now, the thing is, tlb_gather_mmu() begins a preempt-safe region (because 
> we use a per-cpu variable), but as far as I can tell it is _not_ an 
> RCU-safe region.
> 
> So I think we might actually get a real RCU freeing event while this all
> happens. So now the 'anon_vma' that 'page->mapping' points to has not just 
> been released back to the SLUB caches, the page itself might have been 
> released too.
> 
> I dunno. Does the above sound at all sane? Or am I just raving?
> 
> Something hacky like the above might fix it if I'm not just raving. I 
> really might be missing something here.

Right, so unless you have CONFIG_TREE_PREEMPT_RCU=y, the preempt-disable
== RCU read lock assumption does hold.

But even with your patch it doesn't close all holes because while
zap_pte_range() can remove the last mapcount of the page, the
page_remove_tlb() et al. don't need to be the last use count of the
page.

Concurrent reclaim/gup/whatever could still have a count out on the page
delaying the actual free beyond the tlb gather RCU section.

So the reason page->mapping isn't cleared in page_remove_rmap() isn't
detailed beyond a (possible) race with page_add_anon_rmap() (which I
guess would be reclaim trying to unmap the page and a fault re-instating
it).

This also complicates the whole page_lock_anon_vma() thing, so it would
be nice to be able to remove this race and clear page->mapping in
page_remove_rmap().
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ