Message-Id: <20100413204149.D11C.A69D9226@jp.fujitsu.com>
Date: Tue, 13 Apr 2010 21:00:36 +0900 (JST)
From: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: kosaki.motohiro@...fujitsu.com, Rik van Riel <riel@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Borislav Petkov <bp@...en8.de>,
Johannes Weiner <hannes@...xchg.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Minchan Kim <minchan.kim@...il.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Lee Schermerhorn <Lee.Schermerhorn@...com>,
Nick Piggin <npiggin@...e.de>,
Andrea Arcangeli <aarcange@...hat.com>,
Hugh Dickins <hugh.dickins@...cali.co.uk>,
sgunderson@...foot.com
Subject: Re: [PATCH -v2] rmap: make anon_vma_prepare link in all the anon_vmas of a mergeable VMA
> On Tue, 2010-04-13 at 19:53 +0900, KOSAKI Motohiro wrote:
> > > struct anon_vma *page_lock_anon_vma(struct page *page)
> > > {
> > > @@ -294,14 +309,24 @@ struct anon_vma *page_lock_anon_vma(struct page *page)
> > > unsigned long anon_mapping;
> > >
> > > rcu_read_lock();
> > > - anon_mapping = (unsigned long) ACCESS_ONCE(page->mapping);
> > > + anon_mapping = (unsigned long)rcu_dereference(page->mapping);
> > > if ((anon_mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
> > > goto out;
> > > - if (!page_mapped(page))
> > > - goto out;
> > >
> > > anon_vma = (struct anon_vma *) (anon_mapping - PAGE_MAPPING_ANON);
> > > spin_lock(&anon_vma->lock);
> >
> > Is dereferencing anon_vma->lock guaranteed to be safe if page->_mapcount == -1?
> > The anon_vma could have been freed milliseconds ago; rcu_read_lock() doesn't
> > provide such a guarantee.
> >
> > perhaps, I'm missing your point.
>
> No you're right, I got my head hopelessly twisted up trying to make
> page_lock_anon_vma() do something reliable, but there really isn't much
> that can be done.
>
> Luckily most users (with exception of the memory-failure.c one) don't
> really care and all take steps to verify the page is indeed in any of
> the vmas it might find.
>
> So I've given up on this and will only submit a patch like the below,
> which hopefully does still make sense...
>
> I do think there's a missing barrier in there as well, but I've made
> enough of a fool of myself.
>
> [ with the preemptible mmu_gather patches I introduce a refcount to
> the anon_vma, and then with atomic_inc_not_zero() we can add a
> guarantee that the returned anon_vma is alive ]
Indeed. A refcount is the best way. The anon_vma DESTROY_BY_RCU scheme seems
over-engineered to me. It is the fastest approach, but anon_vma allocation is
not (and never was) a bottleneck in fork/exit. So I think the simplest way is
best.
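To illustrate the refcount idea, here is a minimal userspace sketch of the
"increment unless zero" pattern, assuming C11 atomics instead of the kernel's
atomic_t API. The names struct obj, obj_tryget() and obj_put() are hypothetical
and are not taken from any patch in this thread. The sketch also assumes the
memory holding the counter stays valid while it is examined, which in the
kernel would have to come from something like RCU.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object standing in for a refcounted anon_vma. */
struct obj {
    atomic_int refcount;    /* 0 means the object is dead or being freed */
};

/*
 * Take a reference only if the count is still non-zero.
 * This mirrors the semantics of atomic_inc_not_zero(): a dead
 * object can never be brought back to life by a late lookup.
 */
static bool obj_tryget(struct obj *o)
{
    int old = atomic_load(&o->refcount);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool obj_put(struct obj *o)
{
    return atomic_fetch_sub(&o->refcount, 1) == 1;
}
```

A caller that gets false from obj_tryget() must treat the lookup as failed;
a caller that gets true from obj_put() is the one responsible for freeing.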
Also, the following patch looks good to me.
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Thanks for that. I had thought this was really necessary, but my (very) poor
English made me hesitate to write it. Sorry for my laziness ;)
>
> ---
> mm/rmap.c | 18 ++++++++++++++++--
> 1 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index eaa7a09..49a2533 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -285,8 +285,22 @@ void __init anon_vma_init(void)
> }
>
> /*
> - * Getting a lock on a stable anon_vma from a page off the LRU is
> - * tricky: page_lock_anon_vma rely on RCU to guard against the races.
> + * Getting a lock on a stable anon_vma from a page off the LRU is tricky!
> + *
> + * Since there is no serialization whatsoever against page_remove_rmap()
> + * the best this function can do is return a locked anon_vma that might
> + * have been relevant to this page.
> + *
> + * The page might have been remapped to a different anon_vma or the anon_vma
> + * returned may already be freed (and even reused).
> + *
> + * All users of this function must be very careful when walking the anon_vma
> + * chain and verify that the page in question is indeed mapped in it
> + * [ something equivalent to page_mapped_in_vma() ].
> + *
> + * Since anon_vma's slab is DESTROY_BY_RCU and we know from page_remove_rmap()
> + * that the anon_vma pointer from page->mapping is valid if there is a
> + * mapcount, we can dereference the anon_vma after observing those.
> */
> struct anon_vma *page_lock_anon_vma(struct page *page)
> {
>
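The "lock it, then verify it is still the right one" rule from the comment
above can be sketched in userspace as follows. This is only an illustration
under stated assumptions: struct container plays the role of the anon_vma and
struct item the role of the page, all names are hypothetical, the lock is a
toy C11 spinlock, and the container memory is assumed to stay valid (type
stable) even when the hint is stale.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct item;

/* Hypothetical stand-in for an anon_vma. */
struct container {
    atomic_flag lock;             /* toy spinlock */
    const struct item *member;    /* attached item, protected by lock */
};

/* Hypothetical stand-in for a page. */
struct item {
    struct container *_Atomic hint;   /* like page->mapping: may be stale */
};

static void container_lock(struct container *c)
{
    while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void container_unlock(struct container *c)
{
    atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/*
 * Return the locked container that really holds 'it', or NULL.
 * The hint is only a guess; the check under the lock decides.
 */
static struct container *lock_container_of(struct item *it)
{
    struct container *c = atomic_load(&it->hint);

    if (!c)
        return NULL;
    container_lock(c);
    if (c->member != it) {        /* stale hint: back out */
        container_unlock(c);
        return NULL;
    }
    return c;
}
```

The point of the design is that the unlocked read is never trusted on its own;
a NULL return simply means the caller found nothing relevant.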