lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 16 Jul 2014 17:59:22 -0700 (PDT)
From:	David Rientjes <>
To:	Andrew Morton <>
cc:	Andrea Arcangeli <>,
	Vlastimil Babka <>, Mel Gorman <>,
	Rik van Riel <>,
	"Kirill A. Shutemov" <>,
	Bob Liu <>,,
Subject: [patch v3] mm, thp: only collapse hugepages to nodes with affinity
 for zone_reclaim_mode

Commit 9f1b868a13ac ("mm: thp: khugepaged: add policy for finding target 
node") improved the previous khugepaged logic which allocated a 
transparent hugepages from the node of the first page being collapsed.

However, it is still possible to collapse pages to remote memory which may 
suffer from additional access latency.  With the current policy, it is 
possible that 255 pages (with PAGE_SHIFT == 12) will be collapsed remotely 
if the majority are allocated from that node.

When zone_reclaim_mode is enabled, it means the VM should make every attempt
to allocate locally to prevent NUMA performance degradation.  In this case,
we do not want to collapse hugepages to remote nodes that would suffer from
increased access latency.  Thus, when zone_reclaim_mode is enabled, only
allow collapsing to nodes with RECLAIM_DISTANCE or less.

There is no functional change for systems that disable zone_reclaim_mode.

Signed-off-by: David Rientjes <>
 v2: only change behavior for zone_reclaim_mode per Dave Hansen
 v3: optimization based on previous node counts per Vlastimil Babka

 mm/huge_memory.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2234,6 +2234,30 @@ static void khugepaged_alloc_sleep(void)
 static int khugepaged_node_load[MAX_NUMNODES];
+static bool khugepaged_scan_abort(int nid)
+	int i;
+	/*
+	 * If zone_reclaim_mode is disabled, then no extra effort is made to
+	 * allocate memory locally.
+	 */
+	if (!zone_reclaim_mode)
+		return false;
+	/* If there is a count for this node already, it must be acceptable */
+	if (khugepaged_node_load[nid])
+		return false;
+	for (i = 0; i < MAX_NUMNODES; i++) {
+		if (!khugepaged_node_load[i])
+			continue;
+		if (node_distance(nid, i) > RECLAIM_DISTANCE)
+			return true;
+	}
+	return false;
 static int khugepaged_find_target_node(void)
 	static int last_khugepaged_target_node = NUMA_NO_NODE;
@@ -2309,6 +2333,11 @@ static struct page
 	return *hpage;
+static bool khugepaged_scan_abort(int nid)
+	return false;
 static int khugepaged_find_target_node(void)
 	return 0;
@@ -2545,6 +2574,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 		 * hit record.
 		node = page_to_nid(page);
+		if (khugepaged_scan_abort(node))
+			goto out_unmap;
 		VM_BUG_ON_PAGE(PageCompound(page), page);
 		if (!PageLRU(page) || PageLocked(page) || !PageAnon(page))
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at

Powered by blists - more mailing lists