Message-ID: <b53a16f67c93a3fe65e78092069ad135edf00eff.1770645603.git.baolin.wang@linux.alibaba.com>
Date: Mon, 9 Feb 2026 22:07:28 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: akpm@...ux-foundation.org,
david@...nel.org,
catalin.marinas@....com,
will@...nel.org
Cc: lorenzo.stoakes@...cle.com,
ryan.roberts@....com,
Liam.Howlett@...cle.com,
vbabka@...e.cz,
rppt@...nel.org,
surenb@...gle.com,
mhocko@...e.com,
riel@...riel.com,
harry.yoo@...cle.com,
jannh@...gle.com,
willy@...radead.org,
baohua@...nel.org,
dev.jain@....com,
baolin.wang@...ux.alibaba.com,
linux-mm@...ck.org,
linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: [PATCH v6 5/5] mm: rmap: support batched unmapping for file large folios
Similar to folio_referenced_one(), we can apply batched unmapping to file
large folios to improve the performance of file folio reclamation.
Barry previously implemented batched unmapping for lazyfree anonymous large
folios[1], and did not further optimize anonymous or file-backed large
folios at that stage. For file-backed large folios, batched unmapping
support is relatively straightforward: we only need to clear the
consecutive (present) PTE entries that map them.
Note that batched unmapping is not yet supported for the userfaultfd (uffd)
case, so we still fall back to per-page unmapping there.
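To illustrate what "clear the consecutive (present) PTE entries" means,
below is a minimal userspace model of the batch-length check. All names
here (struct model_pte, pte_batch_len(), ...) are invented for the sketch
and are not the kernel API; in the kernel the real work is done by
folio_pte_batch():

/*
 * Userspace model only: count how many consecutive "PTEs", starting at
 * index i, are present and map physically consecutive pages, so that
 * one batched clear could replace per-page unmapping.
 */
#include <stdbool.h>
#include <stdio.h>

struct model_pte {
	bool present;
	unsigned long pfn;	/* page-frame number this entry maps */
};

static unsigned int pte_batch_len(const struct model_pte *ptes,
				  unsigned int i, unsigned int max_nr)
{
	unsigned int nr = 1;

	while (nr < max_nr && ptes[i + nr].present &&
	       ptes[i + nr].pfn == ptes[i].pfn + nr)
		nr++;
	return nr;
}

int main(void)
{
	struct model_pte ptes[] = {
		{ true, 100 }, { true, 101 }, { true, 102 },
		{ false, 0 }, { true, 200 },
	};

	/* Prints 3: entries 0..2 form one contiguous present run. */
	printf("%u\n", pte_batch_len(ptes, 0, 5));
	return 0;
}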
Performance testing:
Allocate 10G of clean file-backed folios via mmap() in a memory cgroup,
then reclaim 8G of them through the memory.reclaim interface. With this
patch I observe a 75% performance improvement on my Arm64 32-core server
(and 50%+ on my x86 machine).
W/o patch:
real 0m1.018s
user 0m0.000s
sys 0m1.018s
W/ patch:
real 0m0.249s
user 0m0.000s
sys 0m0.249s
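For reference, the test can be reproduced along the lines of the sketch
below (assumptions are mine: a cgroup v2 memory cgroup at the illustrative
path /sys/fs/cgroup/test that the task already belongs to, a test file on
a filesystem that allocates large folios, and invented path/size names):

/* Sketch: fault in 10G of clean file-backed memory, then ask the memory
 * controller to reclaim 8G of it. Error handling is trimmed. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAP_SIZE	(10UL << 30)	/* 10G */

int main(void)
{
	long psz = sysconf(_SC_PAGESIZE);
	int fd = open("/mnt/test/file-10g", O_RDONLY);	/* illustrative */
	volatile char c;
	unsigned long off;
	char *p;
	int rfd;

	if (fd < 0)
		return 1;
	p = mmap(NULL, MAP_SIZE, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		return 1;

	/* Touch every page so the range is populated with clean,
	 * file-backed folios. */
	for (off = 0; off < MAP_SIZE; off += psz)
		c = p[off];

	/* Presumably the step timed above, i.e. the equivalent of:
	 *   time echo 8G > /sys/fs/cgroup/test/memory.reclaim */
	rfd = open("/sys/fs/cgroup/test/memory.reclaim", O_WRONLY);
	if (rfd < 0)
		return 1;
	if (write(rfd, "8G", strlen("8G")) < 0)
		perror("memory.reclaim");

	munmap(p, MAP_SIZE);
	return 0;
}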
[1] https://lore.kernel.org/all/20250214093015.51024-4-21cnbao@gmail.com/T/#u
Reviewed-by: Ryan Roberts <ryan.roberts@....com>
Acked-by: Barry Song <baohua@...nel.org>
Reviewed-by: Harry Yoo <harry.yoo@...cle.com>
Signed-off-by: Baolin Wang <baolin.wang@...ux.alibaba.com>
---
mm/rmap.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index 8807f8a7df28..43cb9ac6f523 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1945,12 +1945,16 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
 	end_addr = pmd_addr_end(addr, vma->vm_end);
 	max_nr = (end_addr - addr) >> PAGE_SHIFT;
 
-	/* We only support lazyfree batching for now ... */
-	if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
+	/* We only support lazyfree or file folios batching for now ... */
+	if (folio_test_anon(folio) && folio_test_swapbacked(folio))
 		return 1;
+
 	if (pte_unused(pte))
 		return 1;
 
+	if (userfaultfd_wp(vma))
+		return 1;
+
 	return folio_pte_batch(folio, pvmw->pte, pte, max_nr);
 }
 
@@ -2313,7 +2317,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			 *
 			 * See Documentation/mm/mmu_notifier.rst
 			 */
-			dec_mm_counter(mm, mm_counter_file(folio));
+			add_mm_counter(mm, mm_counter_file(folio), -nr_pages);
 		}
 discard:
 		if (unlikely(folio_test_hugetlb(folio))) {
--
2.47.3