Open Source and information security mailing list archives
Message-Id: <1313740111-27446-10-git-send-email-walken@google.com>
Date:	Fri, 19 Aug 2011 00:48:31 -0700
From:	Michel Lespinasse <walken@...gle.com>
To:	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Cc:	Andrea Arcangeli <aarcange@...hat.com>,
	Hugh Dickins <hughd@...gle.com>,
	Minchan Kim <minchan.kim@...il.com>,
	Johannes Weiner <jweiner@...hat.com>,
	Rik van Riel <riel@...hat.com>, Mel Gorman <mgorman@...e.de>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Shaohua Li <shaohua.li@...el.com>
Subject: [PATCH 9/9] mm: make sure tail page counts are stable before splitting THP pages

As described in the page_cache_get_speculative() comment
in pagemap.h, the reference count of any page coming out of the
allocator must be considered unstable until an RCU grace period
has passed since the page was allocated.

This is an issue for THP because __split_huge_page_refcount()
depends on tail page counts being stable.

By setting a cookie on THP pages when they are allocated, we can
verify that an RCU grace period has elapsed, and hence that the tail
page counts are stable, before splitting such pages. In the typical
case, the THP page will be old enough by the time we try to split it
that no waiting is required.

Signed-off-by: Michel Lespinasse <walken@...gle.com>
---
 mm/huge_memory.c |   33 +++++++++++++++++++++++++++++----
 1 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 81532f2..46c0c0b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -657,15 +657,23 @@ static inline struct page *alloc_hugepage_vma(int defrag,
 					      unsigned long haddr, int nd,
 					      gfp_t extra_gfp)
 {
-	return alloc_pages_vma(alloc_hugepage_gfpmask(defrag, extra_gfp),
+	struct page *page;
+	page = alloc_pages_vma(alloc_hugepage_gfpmask(defrag, extra_gfp),
 			       HPAGE_PMD_ORDER, vma, haddr, nd);
+	if (page)
+		page_get_gp_cookie(page);
+	return page;
 }
 
 #ifndef CONFIG_NUMA
 static inline struct page *alloc_hugepage(int defrag)
 {
-	return alloc_pages(alloc_hugepage_gfpmask(defrag, 0),
+	struct page *page;
+	page = alloc_pages(alloc_hugepage_gfpmask(defrag, 0),
 			   HPAGE_PMD_ORDER);
+	if (page)
+		page_get_gp_cookie(page);
+	return page;
 }
 #endif
 
@@ -1209,7 +1217,7 @@ static void __split_huge_page_refcount(struct page *page)
 		BUG_ON(page_mapcount(page_tail));
 		page_tail->_mapcount = page->_mapcount;
 
-		BUG_ON(page_tail->mapping);
+		BUG_ON(page_tail->mapping);  /* see page_clear_gp_cookie() */
 		page_tail->mapping = page->mapping;
 
 		page_tail->index = ++head_index;
@@ -1387,9 +1395,11 @@ static void __split_huge_page(struct page *page,
 int split_huge_page(struct page *page)
 {
 	struct anon_vma *anon_vma;
-	int ret = 1;
+	int ret;
 
+retry:
 	BUG_ON(!PageAnon(page));
+	ret = 1;
 	anon_vma = page_lock_anon_vma(page);
 	if (!anon_vma)
 		goto out;
@@ -1397,6 +1407,21 @@ int split_huge_page(struct page *page)
 	if (!PageCompound(page))
 		goto out_unlock;
 
+	/*
+	 * Make sure the tail page counts are stable before splitting the page.
+	 * See the page_cache_get_speculative() comment in pagemap.h.
+	 */
+	if (!page_gp_cookie_elapsed(page)) {
+		page_unlock_anon_vma(anon_vma);
+		synchronize_rcu();
+		goto retry;
+	}
+
+	/*
+	 * Make sure page_tail->mapping is cleared before we split up the page.
+	 */
+	page_clear_gp_cookie(page);
+
 	BUG_ON(!PageSwapBacked(page));
 	__split_huge_page(page, anon_vma);
 	count_vm_event(THP_SPLIT);
-- 
1.7.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
