Message-Id: <20250926093343.1000-6-laoar.shao@gmail.com>
Date: Fri, 26 Sep 2025 17:33:36 +0800
From: Yafang Shao <laoar.shao@...il.com>
To: akpm@...ux-foundation.org,
david@...hat.com,
ziy@...dia.com,
baolin.wang@...ux.alibaba.com,
lorenzo.stoakes@...cle.com,
Liam.Howlett@...cle.com,
npache@...hat.com,
ryan.roberts@....com,
dev.jain@....com,
hannes@...xchg.org,
usamaarif642@...il.com,
gutierrez.asier@...wei-partners.com,
willy@...radead.org,
ast@...nel.org,
daniel@...earbox.net,
andrii@...nel.org,
ameryhung@...il.com,
rientjes@...gle.com,
corbet@....net,
21cnbao@...il.com,
shakeel.butt@...ux.dev,
tj@...nel.org,
lance.yang@...ux.dev
Cc: bpf@...r.kernel.org,
linux-mm@...ck.org,
linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org,
Yafang Shao <laoar.shao@...il.com>
Subject: [PATCH v8 mm-new 05/12] mm: thp: decouple THP allocation between swap and page fault paths
The new BPF capability enables finer-grained THP policy decisions by
introducing separate handling for swap faults versus normal page faults.
As highlighted by Barry:
We’ve observed that swapping in large folios can lead to more
swap thrashing for some workloads, e.g. kernel build. Consequently,
some workloads might prefer swapping in smaller folios than those
allocated by alloc_anon_folio().
While prctl() could potentially be extended to leverage this new policy,
doing so would require modifications to the uAPI.
Signed-off-by: Yafang Shao <laoar.shao@...il.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Barry Song <21cnbao@...il.com>
---
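Illustration only, not part of the patch: a minimal userspace sketch of
what the new enum value makes possible. It assumes PMD_ORDER is 9 (4K
base pages) and invents demo_in_pf(), demo_policy_orders() and
DEMO_SWAP_MAX_ORDER purely for demonstration; the actual BPF hook is
introduced elsewhere in the series and may look quite different.

```c
/* Userspace model of enum tva_type as it looks after this patch. */
enum tva_type {
	TVA_SMAPS,
	TVA_PAGEFAULT,		/* non-swap page fault */
	TVA_KHUGEPAGED,
	TVA_FORCED_COLLAPSE,
	TVA_SWAP_PAGEFAULT,	/* swap page fault */
};

#define BIT(n)			(1UL << (n))
#define PMD_ORDER		9	/* assumption: 4K pages, 2M PMD */
#define DEMO_SWAP_MAX_ORDER	2	/* hypothetical cap for swap-in */

/* Both fault types still count as "in page fault", as in the patch. */
int demo_in_pf(enum tva_type type)
{
	return type == TVA_PAGEFAULT || type == TVA_SWAP_PAGEFAULT;
}

/*
 * Hypothetical policy: leave normal faults alone, but restrict swap
 * faults to orders 0..DEMO_SWAP_MAX_ORDER. Each bit N in 'orders'
 * stands for folio order N.
 */
unsigned long demo_policy_orders(enum tva_type type, unsigned long orders)
{
	if (type == TVA_SWAP_PAGEFAULT)
		return orders & (BIT(DEMO_SWAP_MAX_ORDER + 1) - 1);
	return orders;
}
```

With orders = BIT(PMD_ORDER) - 1, the mask passed by alloc_swap_folio(),
a TVA_PAGEFAULT caller keeps every order below PMD_ORDER, while a
TVA_SWAP_PAGEFAULT caller is limited to the low orders. Before this
patch the two callers could not be told apart.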
include/linux/huge_mm.h | 3 ++-
mm/huge_memory.c | 2 +-
mm/memory.c | 2 +-
3 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index fea94c059bed..bd30694f6a9c 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -97,9 +97,10 @@ extern struct kobj_attribute thpsize_shmem_enabled_attr;
enum tva_type {
TVA_SMAPS, /* Exposing "THPeligible:" in smaps. */
- TVA_PAGEFAULT, /* Serving a page fault. */
+ TVA_PAGEFAULT, /* Serving a non-swap page fault. */
TVA_KHUGEPAGED, /* Khugepaged collapse. */
TVA_FORCED_COLLAPSE, /* Forced collapse (e.g. MADV_COLLAPSE). */
+ TVA_SWAP_PAGEFAULT, /* Serving a swap page fault. */
};
#define thp_vma_allowable_order(vma, type, order) \
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1ac476fe6dc5..08372dfcb41a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -102,7 +102,7 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
unsigned long orders)
{
const bool smaps = type == TVA_SMAPS;
- const bool in_pf = type == TVA_PAGEFAULT;
+ const bool in_pf = (type == TVA_PAGEFAULT || type == TVA_SWAP_PAGEFAULT);
const bool forced_collapse = type == TVA_FORCED_COLLAPSE;
unsigned long supported_orders;
vm_flags_t vm_flags = vma->vm_flags;
diff --git a/mm/memory.c b/mm/memory.c
index cd04e4894725..58ea0f93f79e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4558,7 +4558,7 @@ static struct folio *alloc_swap_folio(struct vm_fault *vmf)
* Get a list of all the (large) orders below PMD_ORDER that are enabled
* and suitable for swapping THP.
*/
- orders = thp_vma_allowable_orders(vma, TVA_PAGEFAULT,
+ orders = thp_vma_allowable_orders(vma, TVA_SWAP_PAGEFAULT,
BIT(PMD_ORDER) - 1);
orders = thp_vma_suitable_orders(vma, vmf->address, orders);
orders = thp_swap_suitable_orders(swp_offset(entry),
--
2.47.3