lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 17 Jun 2011 20:32:37 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Hugh Dickins <hughd@...gle.com>,
	Tim Chen <tim.c.chen@...ux.intel.com>,
	Andi Kleen <ak@...ux.intel.com>,
	Shaohua Li <shaohua.li@...el.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	David Miller <davem@...emloft.net>,
	Martin Schwidefsky <schwidefsky@...ibm.com>,
	Russell King <rmk@....linux.org.uk>,
	Paul Mundt <lethal@...ux-sh.org>,
	Jeff Dike <jdike@...toit.com>,
	Richard Weinberger <richard@....at>,
	"Luck, Tony" <tony.luck@...el.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Mel Gorman <mel@....ul.ie>, Nick Piggin <npiggin@...nel.dk>,
	Namhyung Kim <namhyung@...il.com>,
	"Shi, Alex" <alex.shi@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"Rafael J. Wysocki" <rjw@...k.pl>
Subject: Re: REGRESSION: Performance regressions from switching
 anon_vma->lock to mutex

On Fri, 2011-06-17 at 20:18 +0200, Peter Zijlstra wrote:
> On Fri, 2011-06-17 at 11:01 -0700, Linus Torvalds wrote:
> 
> > So I do think that "page_referenced_anon()" should do a trylock, and
> > return "referenced" if the trylock fails. Comments?
> 
> The only problem I can immediately see with that is when a single
> process' anon memory is dominant, then such an allocation will never
> succeed in freeing these pages because the one lock will pin pretty much
> all anon. Then again, there's always a few file pages to drop.
> 
> That said, its rather unlikely, and iirc people were working on removing
> direct reclaim, or at least rely less on it.
> 
> 

something like so I guess, completely untested etc..

Also, there's a page_lock_anon_vma() user in split_huge_page(), which is
used in mm/swap_state.c:add_to_swap(), which is also in the reclaim
path, not quite sure what to do there.

Aside from the THP thing there's a user in memory-failure.c, which looks
to be broken as it is because its calling things under tasklist_lock
which isn't preemptible, but it looks like we can simply swap the
tasklist_lock vs page_lock_anon_vma.

---
 kernel/Makefile |    1 +
 mm/rmap.c       |   39 +++++++++++++++++++++++++++++++++++++--
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/kernel/Makefile b/kernel/Makefile
index 2d64cfc..f6d05de 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -101,6 +101,7 @@ obj-$(CONFIG_RING_BUFFER) += trace/
 obj-$(CONFIG_TRACEPOINTS) += trace/
 obj-$(CONFIG_SMP) += sched_cpupri.o
 obj-$(CONFIG_IRQ_WORK) += irq_work.o
+obj-m += test.o
 
 obj-$(CONFIG_PERF_EVENTS) += events/
 
diff --git a/mm/rmap.c b/mm/rmap.c
index 0eb463e..40cd399 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -400,6 +400,41 @@ out:
 	return anon_vma;
 }
 
+struct anon_vma *page_trylock_anon_vma(struct page *page)
+{
+	struct anon_vma *anon_vma = NULL;
+	struct anon_vma *root_anon_vma;
+	unsigned long anon_mapping;
+
+	rcu_read_lock();
+	anon_mapping = (unsigned long) ACCESS_ONCE(page->mapping);
+	if ((anon_mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
+		goto out;
+	if (!page_mapped(page))
+		goto out;
+
+	anon_vma = (struct anon_vma *) (anon_mapping - PAGE_MAPPING_ANON);
+	root_anon_vma = ACCESS_ONCE(anon_vma->root);
+	if (!mutex_trylock(&root_anon_vma->mutex)) {
+		anon_vma = NULL;
+		goto out;
+	}
+
+	/*
+	 * If the page is still mapped, then this anon_vma is still
+	 * its anon_vma, and holding the mutex ensures that it will
+	 * not go away, see anon_vma_free().
+	 */
+	if (!page_mapped(page)) {
+		mutex_unlock(&root_anon_vma->mutex);
+		anon_vma = NULL;
+	}
+
+out:
+	rcu_read_unlock();
+	return anon_vma;
+}
+
 /*
  * Similar to page_get_anon_vma() except it locks the anon_vma.
  *
@@ -694,7 +729,7 @@ static int page_referenced_anon(struct page *page,
 	struct anon_vma_chain *avc;
 	int referenced = 0;
 
-	anon_vma = page_lock_anon_vma(page);
+	anon_vma = page_trylock_anon_vma(page);
 	if (!anon_vma)
 		return referenced;
 
@@ -1396,7 +1431,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
 	struct anon_vma_chain *avc;
 	int ret = SWAP_AGAIN;
 
-	anon_vma = page_lock_anon_vma(page);
+	anon_vma = page_trylock_anon_vma(page);
 	if (!anon_vma)
 		return ret;
 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ