Message-ID: <20250403141032.22743-1-donettom@linux.ibm.com>
Date: Thu,  3 Apr 2025 09:10:32 -0500
From: Donet Tom <donettom@...ux.ibm.com>
To: Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        Gregory Price <gourry@...rry.net>,
        Matthew Wilcox <willy@...radead.org>, Yu Zhao <yuzhao@...gle.com>
Cc: Ritesh Harjani <ritesh.list@...il.com>, linux-kernel@...r.kernel.org,
        aneesh.kumar@...nel.org, David Hildenbrand <david@...hat.com>,
        Huang Ying <ying.huang@...ux.alibaba.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Donet Tom <donettom@...ux.ibm.com>
Subject: [RFC PATCH v2]  mm/swap.c: Enable promotion of unmapped MGLRU page cache pages

This patch is based on patch [1], which introduced support for
promoting unmapped normal LRU page cache pages. Here, we extend
the functionality to support promotion of MGLRU page cache pages.

An MGLRU page cache page is eligible for promotion when:

1. Memory tiering and pagecache_promotion_enabled are both enabled.
2. It resides in a lower memory tier.
3. It is referenced.
4. It is part of the working set.
5. Its folio reference count is at the maximum (LRU_REFS_MASK).

When a page is accessed through a file descriptor, folio_inc_refs()
is invoked. The first access will set the folio’s referenced flag,
and subsequent accesses will increment the reference count in the
folio flag (reference counter size in folio flags is 2 bits). Once
the referenced flag is set, and the folio’s reference count reaches
the maximum value (LRU_REFS_MASK), the working set flag will be set
as well.

If a folio has both the referenced and working set flags set, and its
reference count equals LRU_REFS_MASK, it becomes a good candidate for
promotion. These pages will be added to the promotion list. The
per-process task task_numa_promotion_work() takes the pages from the
promotion list and promotes them to a higher memory tier.

In the MGLRU, for folios accessed through a file descriptor, if the
folio’s referenced and working set flags are set, and the folio's
reference count is equal to LRU_REFS_MASK, the folio is lazily
promoted to the second oldest generation in the eviction path. When
folio_inc_gen() does this, it clears the LRU_REFS_FLAGS so that
lru_gen_inc_refs() can start over.

Test process:
We measured the read time in the following scenarios for both LRU and MGLRU.
Scenario 1: Pages are on the lower tier, promotion off
Scenario 2: Pages are on the lower tier, promotion on
Scenario 3: Pages are on the higher tier

Test Results MGLRU
----------------------------------------------------------------------
Pages on higher   | Pages on lower tier, | Pages on lower tier,      |
tier              | promotion off        | promotion on              |
----------------------------------------------------------------------
0.48s             | 1.6s                 | During promotion: 3.3s    |
                  |                      | After promotion:  0.48s   |
----------------------------------------------------------------------

Test Results LRU
----------------------------------------------------------------------
Pages on higher   | Pages on lower tier, | Pages on lower tier,      |
tier              | promotion off        | promotion on              |
----------------------------------------------------------------------
0.48s             | 1.6s                 | During promotion: 3.3s    |
                  |                      | After promotion:  0.48s   |
----------------------------------------------------------------------

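MGLRU and LRU show the same benefit: once promotion completes, read time
drops from 1.6s to 0.48s, matching pages that already reside on the
higher tier. Reads are slower (3.3s) while promotion is in progress.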

[1] https://lore.kernel.org/all/20250107000346.1338481-1-gourry@gourry.net/

Signed-off-by: Donet Tom <donettom@...ux.ibm.com>
---
v1->v2

In V1, the folios that were part of the memcg and the active MGLRU list
were being promoted. However, in MGLRU, file pages accessed through
file descriptors are moved to the second oldest generation. This second
oldest generation may not necessarily be part of the active list.
Furthermore, this movement to the second oldest generation only happens
in the eviction path, so if the system is not under memory pressure,
this movement will not occur. As a result, hot pages can be present in
any generation. If the reference count is at its maximum and the
referenced and working set flags are set, the page becomes a candidate
for promotion.

v1 - https://lore.kernel.org/all/20250115120625.3785-1-donettom@linux.ibm.com/
---
 mm/swap.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/swap.c b/mm/swap.c
index b2341bc18452..f3c19d563556 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -399,8 +399,13 @@ static void lru_gen_inc_refs(struct folio *folio)
 
 	do {
 		if ((old_flags & LRU_REFS_MASK) == LRU_REFS_MASK) {
-			if (!folio_test_workingset(folio))
+			if (!folio_test_workingset(folio)) {
 				folio_set_workingset(folio);
+			} else if (!folio_test_isolated(folio) &&
+				  (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
+				   numa_pagecache_promotion_enabled) {
+				promotion_candidate(folio);
+			}
 			return;
 		}
 
-- 
2.43.5

