linux-kernel - Re: [PATCH 2/2] mm,migration: Fix race between shift_arg_pages and rmap_walk by guaranteeing rmap_walk finds PTEs created within the temporary stack

Message-Id: <20100510093241.420743a8.kamezawa.hiroyu@jp.fujitsu.com>
Date:	Mon, 10 May 2010 09:32:41 +0900
From:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To:	Mel Gorman <mel@....ul.ie>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux-MM <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Minchan Kim <minchan.kim@...il.com>,
	Christoph Lameter <cl@...ux.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Rik van Riel <riel@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 2/2] mm,migration: Fix race between shift_arg_pages and
 rmap_walk by guaranteeing rmap_walk finds PTEs created within the temporary
 stack

On Sun, 9 May 2010 20:21:45 +0100
Mel Gorman <mel@....ul.ie> wrote:

> On Thu, May 06, 2010 at 07:12:59PM -0700, Linus Torvalds wrote:
> > 
> > 
> > On Fri, 7 May 2010, KAMEZAWA Hiroyuki wrote:
> > > 
> > > IIUC, move_page_tables() may call "page table allocation" and it cannot be
> > > done under spinlock.
> > 
> > Bah. It only does a "alloc_new_pmd()", and we could easily move that out 
> > of the loop and pre-allocate the pmd's.
> > 
> > If that's the only reason, then it's a really weak one, methinks.
> > 
> 
> It turns out not to be easy to preallocate the PUDs, PMDs and PTEs that
> move_page_tables() needs.  To avoid overallocating, it has to follow the same
> logic as move_page_tables(), duplicating some code in the process. The ugliest
> aspect of all is passing those pre-allocated pages back into move_page_tables(),
> where they need to be passed down to such functions as __pte_alloc. It gets
> extremely messy.
> 
> I stopped working on it about half way through as it was already too ugly
> to live and would have similar cost to Kamezawa's much more straight-forward
> approach of using move_vma().
> 
> While using move_vma is straight-forward and solves the problem, it's
> not as cheap as Andrea's solution. Andrea allocates a temporary VMA and
> puts it on a list and very little else. It didn't show up any problems
> in microbenchmarks. Calling move_vma does a lot more work particularly in
> copy_vma and this slows down exec.
> 
> With Kamezawa's patch, kernbench was fine on wall time but in System Time,
> it slowed by up to 1.48% in comparison to Andrea's slowing by up to 0.64%[1].
> 
> aim9 was slowed as well. Kamezawa's slowed by 2.77% whereas Andrea's was
> reported faster by 2.58%. While AIM9 is flaky and these figures are barely
> outside the noise, calling move_vma() is obviously more expensive.
> 

Thank you for testing.


> While my solution at http://lkml.org/lkml/2010/4/30/198 is cheapest as it
> does not touch exec() at all, is_vma_temporary_stack() could be broken in
> the future if any of the assumptions it makes change.
> 
> So what you have is an inverse relationship between magic and
> performance. Mine has the most magic and is fastest. Kamezawa's has the
> least magic but is slowest, and Andrea's has the goldilocks factor. Which do
> you prefer?
> 

I like the fastest one ;)

Thanks,
-Kame
