linux-kernel - [RFC][PATCH v2 8/8] Don't deactivate many touched page

Message-Id: <20091210163429.2568.A69D9226@jp.fujitsu.com>
Date:	Thu, 10 Dec 2009 16:35:51 +0900 (JST)
From:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To:	LKML <linux-kernel@...r.kernel.org>
Cc:	kosaki.motohiro@...fujitsu.com, linux-mm <linux-mm@...ck.org>,
	Rik van Riel <riel@...hat.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Larry Woodman <lwoodman@...hat.com>
Subject: [RFC][PATCH v2  8/8] Don't deactivate many touched page

Changelog
 o from v1
   - Fix comments.
   - Rename too_many_young_bit_found() to too_many_referenced()
     [as Rik suggested].
 o from Andrea's original patch
   - Rebase on top of my patches.
   - Use a list_cut_position()/list_splice_tail() pair instead of
     list_del()/list_add() to keep the pte scan fair.
   - Only apply the max young threshold when soft_try is true.
     This avoids a spurious OOM side effect.
   - Return SWAP_AGAIN instead of a successful result if the max
     young threshold is exceeded. This prevents pages whose pte
     young bits were not all cleared from being deactivated wrongly.
   - Add handling for KSM pages.

Pages that are shared and frequently used by many processes do not
need to be deactivated and passed to try_to_unmap(). While VM pressure
is low this is pointless: the page is likely to be reactivated soon,
so the work only wastes CPU time.

Therefore, when soft_try is set, this patch stops the pte scan once
wipe_page_reference() has found MAX_YOUNG_BIT_CLEARED (64) or more
young pte bits.
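
For readers who want to see the early-exit behaviour in isolation, here
is a minimal user-space sketch. It is not kernel code: the
too_many_referenced() check and MAX_YOUNG_BIT_CLEARED mirror the patch
below, but scan_mappings(), the young[] array and the reduced
struct page_reference_context are hypothetical stand-ins for the rmap
walk, the per-mapping accessed bits and the real context structure.

```c
#include <stddef.h>

/* Threshold taken from the patch; the rest is a simplified model. */
#define MAX_YOUNG_BIT_CLEARED 64

/* Hypothetical stand-ins for the kernel's return codes. */
enum { SWAP_SUCCESS, SWAP_AGAIN };

/* Reduced model of the context: only the two fields the check reads. */
struct page_reference_context {
	int soft_try;	/* nonzero while VM pressure is low */
	int referenced;	/* young bits found and cleared so far */
};

/* Same test as in the patch: the limit applies only when soft_try is set. */
static int too_many_referenced(const struct page_reference_context *refctx)
{
	return refctx->soft_try &&
	       refctx->referenced >= MAX_YOUNG_BIT_CLEARED;
}

/*
 * Hypothetical model of the scan loop. young[i] is nonzero if mapping i
 * has its accessed bit set. The loop clears each set bit and stops with
 * SWAP_AGAIN as soon as the threshold is reached, leaving the remaining
 * bits untouched. *scanned reports how many mappings were visited.
 */
static int scan_mappings(int *young, size_t nr,
			 struct page_reference_context *refctx,
			 size_t *scanned)
{
	size_t i;

	*scanned = 0;
	for (i = 0; i < nr; i++) {
		if (young[i]) {
			young[i] = 0;	/* clear the accessed bit */
			refctx->referenced++;
		}
		(*scanned)++;
		if (too_many_referenced(refctx))
			return SWAP_AGAIN;
	}
	return SWAP_SUCCESS;
}
```

In this model, a page with 200 young mappings scanned with soft_try set
stops after 64 mappings and returns SWAP_AGAIN, so the other 136 bits
stay set. With soft_try clear, all 200 mappings are visited and the
result is SWAP_SUCCESS.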

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Reviewed-by: Rik van Riel <riel@...hat.com>
---
 include/linux/rmap.h |   18 ++++++++++++++++++
 mm/ksm.c             |    4 ++++
 mm/rmap.c            |   19 +++++++++++++++++++
 3 files changed, 41 insertions(+), 0 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 499972e..ddf2578 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -128,6 +128,24 @@ int wipe_page_reference_one(struct page *page,
 			    struct page_reference_context *refctx,
 			    struct vm_area_struct *vma, unsigned long address);
 
+#define MAX_YOUNG_BIT_CLEARED 64
+/*
+ * If VM pressure is low and the page has lots of active users, we only
+ * clear up to MAX_YOUNG_BIT_CLEARED accessed bits at a time.  Clearing
+ * accessed bits takes CPU time, needs TLB invalidate IPIs and could
+ * cause lock contention.  Since a heavily shared page is very likely
+ * to be used again soon, the cost outweighs the benefit of making such
+ * a heavily shared page a candidate for eviction.
+ */
+static inline
+int too_many_referenced(struct page_reference_context *refctx)
+{
+	if (refctx->soft_try &&
+	    refctx->referenced >= MAX_YOUNG_BIT_CLEARED)
+		return 1;
+	return 0;
+}
+
 enum ttu_flags {
 	TTU_UNMAP = 0,			/* unmap mode */
 	TTU_MIGRATION = 1,		/* migration mode */
diff --git a/mm/ksm.c b/mm/ksm.c
index 19559ae..e959c41 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1586,6 +1586,10 @@ again:
 						      rmap_item->address);
 			if (ret != SWAP_SUCCESS)
 				goto out;
+			if (too_many_referenced(refctx)) {
+				ret = SWAP_AGAIN;
+				goto out;
+			}
 			mapcount--;
 			if (!search_new_forks || !mapcount)
 				break;
diff --git a/mm/rmap.c b/mm/rmap.c
index cfda0a0..d66b8dc 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -473,6 +473,21 @@ static int wipe_page_reference_anon(struct page *page,
 		ret = wipe_page_reference_one(page, refctx, vma, address);
 		if (ret != SWAP_SUCCESS)
 			break;
+		if (too_many_referenced(refctx)) {
+			LIST_HEAD(tmp_list);
+
+			/*
+			 * Rotating the anon vmas around helps spread out lock
+			 * pressure in the VM. It helps to reduce heavy lock
+			 * contention.
+			 */
+			list_cut_position(&tmp_list,
+					  &vma->anon_vma_node,
+					  &anon_vma->head);
+			list_splice_tail(&tmp_list, &vma->anon_vma_node);
+			ret = SWAP_AGAIN;
+			break;
+		}
 		mapcount--;
 		if (!mapcount || refctx->maybe_mlocked)
 			break;
@@ -543,6 +558,10 @@ static int wipe_page_reference_file(struct page *page,
 		ret = wipe_page_reference_one(page, refctx, vma, address);
 		if (ret != SWAP_SUCCESS)
 			break;
+		if (too_many_referenced(refctx)) {
+			ret = SWAP_AGAIN;
+			break;
+		}
 		mapcount--;
 		if (!mapcount || refctx->maybe_mlocked)
 			break;
-- 
1.6.5.2


