[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20100504131231.GF20979@csn.ul.ie>
Date: Tue, 4 May 2010 14:12:31 +0100
From: Mel Gorman <mel@....ul.ie>
To: Rik van Riel <riel@...hat.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
akpm@...ux-foundation.org, Linux-MM <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>,
Minchan Kim <minchan.kim@...il.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Christoph Lameter <cl@...ux.com>
Subject: Re: [PATCH 1/2] mm: Take all anon_vma locks in anon_vma_lock
On Mon, May 03, 2010 at 01:58:36PM -0400, Rik van Riel wrote:
>>
>> Btw, Mel's patch doesn't really match the description of 2/2. 2/2 says
>> that all pages must always be findable in rmap. Mel's patch seems to
>> explicitly say "we want to ignore that thing that is busy for execve". Are
>> we just avoiding a BUG_ON()? Is perhaps the BUG_ON() buggy?
>
> I have no good answer to this question.
>
> Mel? Andrea?
>
The wording could have been better.
The problem is that once a migration PTE is established, it is expected that
rmap can find it. In the specific case of exec, this can fail because of
how the temporary stack is moved. As migration colliding with exec is rare,
the approach taken by the patch was to not create migration PTEs that rmap
could not find. On the plus side, exec (the common case) is unaffected. On
the negative side, it's avoiding the exec vs migration problem instead of
fixing it.
The BUG_ON is not a buggy check. While migration is taking place, the page lock
is held and not unreleased until all the migration PTEs have been removed. If
a migration entry exists and the page is unlocked, it means that rmap failed
to find all the entries. If the BUG_ON was not made, do_swap_page() would
either end up looking up a semi-random entry in swap cache and inserting it
(memory corruption), inserting a random page from swap (memory corruption)
or returning VM_FAULT_OOM to the fault handler (general carnage).
It was considered to lazily clean up the migration PTEs
(http://lkml.org/lkml/2010/4/27/458) but there is no guarantee that the page
the migration PTE pointed to is still the correct one. If it had been freed
and re-used, the results would probably be memory corruption.
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists