Date:	Mon, 10 May 2010 09:32:41 +0900
From:	KAMEZAWA Hiroyuki <>
To:	Mel Gorman <>
Cc:	Linus Torvalds <>,
	Andrew Morton <>,
	Linux-MM <>,
	LKML <>,
	Minchan Kim <>,
	Christoph Lameter <>,
	Andrea Arcangeli <>,
	Rik van Riel <>,
	Peter Zijlstra <>
Subject: Re: [PATCH 2/2] mm,migration: Fix race between shift_arg_pages and
 rmap_walk by guaranteeing rmap_walk finds PTEs created within the temporary

On Sun, 9 May 2010 20:21:45 +0100
Mel Gorman <> wrote:

> On Thu, May 06, 2010 at 07:12:59PM -0700, Linus Torvalds wrote:
> > 
> > 
> > On Fri, 7 May 2010, KAMEZAWA Hiroyuki wrote:
> > > 
> > > IIUC, move_page_tables() may call "page table allocation" and it cannot be
> > > done under spinlock.
> > 
> > Bah. It only does a "alloc_new_pmd()", and we could easily move that out 
> > of the loop and pre-allocate the pmd's.
> > 
> > If that's the only reason, then it's a really weak one, methinks.
> > 
> It turns out not to be easy to preallocate the PUDs, PMDs and PTEs that
> move_page_tables() needs.  To avoid overallocating, it has to follow the same
> logic as move_page_tables(), duplicating some code in the process. The ugliest
> aspect of all is passing those pre-allocated pages back into move_page_tables(),
> where they need to be passed down to such functions as __pte_alloc. It gets
> extremely messy.
>
> I stopped working on it about half way through as it was already too ugly
> to live and would have similar cost to Kamezawa's much more straight-forward
> approach of using move_vma().
>
> While using move_vma() is straight-forward and solves the problem, it's
> not as cheap as Andrea's solution. Andrea allocates a temporary VMA and
> puts it on a list and very little else. It didn't show up any problems
> in microbenchmarks. Calling move_vma() does a lot more work, particularly in
> copy_vma(), and this slows down exec.
>
> With Kamezawa's patch, kernbench was fine on wall time but in System Time,
> it slowed by up to 1.48% in comparison to Andrea's, which slowed by 0.64%[1].
> aim9 was slowed as well. Kamezawa's slowed by 2.77% where Andrea's reported
> faster by 2.58%. While AIM9 is flaky and these figures are barely outside
> the noise, calling move_vma() is obviously more expensive.

Thank you for testing.

> While my solution at is cheapest as it
> does not touch exec() at all, is_vma_temporary_stack() could be broken in
> the future if any of the assumptions it makes change.
> So what you have is an inverse relationship between magic and
> performance. Mine has the most magic and is fastest. Kamezawa's has the
> least magic but slowest and Andrea has the goldilocks factor. Which do
> you prefer?

I like the fastest one ;)

