Message-Id: <80a0cfd545a01ce0470a727cb961a5e0a1532d48.1637778851.git.hasanalmaruf@fb.com>
Date: Wed, 24 Nov 2021 13:58:30 -0500
From: Hasan Al Maruf <hasan3050@...il.com>
To: dave.hansen@...ux.intel.com, ying.huang@...el.com,
yang.shi@...ux.alibaba.com, mgorman@...hsingularity.net,
riel@...riel.com, hannes@...xchg.org
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: [PATCH 5/5] active LRU-based promotion to avoid ping-pong

Whenever a remote hint-fault happens on a page, the default NUMA
balancing promotes the page without checking its state. As a result,
cold pages with very infrequent accesses can still become promotion
candidates. Once promoted to the local node, such pages may shortly
become demotion candidates if the toptier nodes are always under
pressure. Thus, promotion traffic generated by infrequently accessed
pages can easily fill up the toptier node's reclaimed free space and
eventually generate higher demotion traffic toward the non-toptier
nodes. This demotion-promotion ping-pong causes unnecessary traffic
across the memory nodes and can negatively impact the performance of
memory-bound applications.

To solve this ping-pong issue, instead of promoting instantly, we
check a page's age through its position in the LRU lists. If the
faulted page is on an inactive LRU, we don't immediately consider it
a promotion candidate, as it might be an infrequently accessed page.
We only consider faulted pages that are on the active LRUs (either
the anon or the file active LRU) as promotion candidates. This
approach significantly reduces the promotion traffic and always
maintains a satisfactory amount of free memory on the toptier node to
support both new allocations and promotions from non-toptier nodes.
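
For reference, the decision this patch adds boils down to the
following simplified logic (a sketch only, not the exact hunk; the
real code in do_numa_page() below also bumps the NUMA hint-fault
counters and drops the PTE lock before bailing out):

	/* skip promotion for cold pages on non-toptier nodes */
	if (numa_promotion_tiered_enabled &&
	    !node_is_toptier(page_nid) &&	/* page lives on a slow node */
	    !PageActive(page)) {		/* but has not proven hot yet */
		mark_page_accessed(page);	/* age it toward the active LRU */
		goto out;			/* no migration this time */
	}
	/* otherwise fall through to numa_migrate_prep() and migrate */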
Signed-off-by: Hasan Al Maruf <hasanalmaruf@...com>
---
mm/memory.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/mm/memory.c b/mm/memory.c
index 314fe3b2f462..1c76f074784a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4202,6 +4202,19 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
 	last_cpupid = page_cpupid_last(page);
 	page_nid = page_to_nid(page);
+
+	/* Only migrate pages that are active on non-toptier node */
+	if (numa_promotion_tiered_enabled &&
+	    !node_is_toptier(page_nid) &&
+	    !PageActive(page)) {
+		count_vm_numa_event(NUMA_HINT_FAULTS);
+		if (page_nid == numa_node_id())
+			count_vm_numa_event(NUMA_HINT_FAULTS_LOCAL);
+		mark_page_accessed(page);
+		pte_unmap_unlock(vmf->pte, vmf->ptl);
+		goto out;
+	}
+
 	target_nid = numa_migrate_prep(page, vma, vmf->address, page_nid,
 				       &flags);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
--
2.30.2