Message-ID: <20260106132203.kdxfvootlkxzex2l@master>
Date: Tue, 6 Jan 2026 13:22:03 +0000
From: Wei Yang <richard.weiyang@...il.com>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>
Cc: akpm@...ux-foundation.org, david@...nel.org, catalin.marinas@....com,
will@...nel.org, lorenzo.stoakes@...cle.com, ryan.roberts@....com,
Liam.Howlett@...cle.com, vbabka@...e.cz, rppt@...nel.org,
surenb@...gle.com, mhocko@...e.com, riel@...riel.com,
harry.yoo@...cle.com, jannh@...gle.com, willy@...radead.org,
baohua@...nel.org, dev.jain@....com, linux-mm@...ck.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 5/5] mm: rmap: support batched unmapping for file
large folios
On Fri, Dec 26, 2025 at 02:07:59PM +0800, Baolin Wang wrote:
>Similar to folio_referenced_one(), we can apply batched unmapping for file
>large folios to optimize the performance of file folio reclamation.
>
>Barry previously implemented batched unmapping for lazyfree anonymous large
>folios[1] and did not further optimize anonymous large folios or file-backed
>large folios at that stage. As for file-backed large folios, the batched
>unmapping support is relatively straightforward, as we only need to clear
>the consecutive (present) PTE entries for file-backed large folios.
>
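(To make the "clear the consecutive (present) PTE entries" part concrete for
other readers: as far as I understand, with this patch the unmap path boils
down to roughly the sketch below. get_and_clear_full_ptes() is the existing
helper from include/linux/pgtable.h; everything else is simplified,
illustrative pseudocode rather than the exact kernel code.)

	/* How many consecutive present PTEs of this folio can we batch? */
	nr_pages = folio_unmap_pte_batch(folio, &pvmw, vma, pteval);

	/* Clear all of them at once instead of one PTE per walk iteration. */
	pteval = get_and_clear_full_ptes(mm, address, pvmw.pte, nr_pages, 0);
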
>Performance testing:
>Allocate 10G of clean file-backed folios via mmap() in a memory cgroup, and try
>to reclaim 8G of file-backed folios via the memory.reclaim interface. I observed
>a 75% performance improvement on my Arm64 32-core server (and a 50%+ improvement
>on my X86 machine) with this patch.
>
>W/o patch:
>real 0m1.018s
>user 0m0.000s
>sys 0m1.018s
>
>W/ patch:
>real 0m0.249s
>user 0m0.000s
>sys 0m0.249s
>
>[1] https://lore.kernel.org/all/20250214093015.51024-4-21cnbao@gmail.com/T/#u
>Reviewed-by: Ryan Roberts <ryan.roberts@....com>
>Acked-by: Barry Song <baohua@...nel.org>
>Signed-off-by: Baolin Wang <baolin.wang@...ux.alibaba.com>
>---
> mm/rmap.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
>diff --git a/mm/rmap.c b/mm/rmap.c
>index 985ab0b085ba..e1d16003c514 100644
>--- a/mm/rmap.c
>+++ b/mm/rmap.c
>@@ -1863,9 +1863,10 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
> end_addr = pmd_addr_end(addr, vma->vm_end);
> max_nr = (end_addr - addr) >> PAGE_SHIFT;
>
>- /* We only support lazyfree batching for now ... */
>- if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
>+ /* We only support lazyfree or file folios batching for now ... */
>+ if (folio_test_anon(folio) && folio_test_swapbacked(folio))
> return 1;
>+
> if (pte_unused(pte))
> return 1;
>
>@@ -2231,7 +2232,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
> *
> * See Documentation/mm/mmu_notifier.rst
> */
>- dec_mm_counter(mm, mm_counter_file(folio));
>+ add_mm_counter(mm, mm_counter_file(folio), -nr_pages);
> }
> discard:
> if (unlikely(folio_test_hugetlb(folio))) {
>--
>2.47.3
>
Hi, Baolin
While reading your patch, one small question came to mind.
The current try_to_unmap_one() has the following structure:
try_to_unmap_one()
	while (page_vma_mapped_walk(&pvmw)) {
		nr_pages = folio_unmap_pte_batch();
		if (nr_pages == folio_nr_pages(folio))
			goto walk_done;
	}
I am wondering what happens when nr_pages > 1 but nr_pages != folio_nr_pages(folio).
If my understanding is correct, page_vma_mapped_walk() would restart from
(pvmw->address + PAGE_SIZE) in the next iteration, but we have already cleared
PTEs up to (pvmw->address + nr_pages * PAGE_SIZE), right?
I am not sure my understanding is correct; if it is, is there a reason not to
skip the cleared range?
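To make the question concrete, this rough, untested sketch is what I was
picturing (field names taken from struct page_vma_mapped_walk; I am just
assuming it is safe to advance the walk state at this point):

	if (nr_pages > 1 && nr_pages != folio_nr_pages(folio)) {
		/*
		 * Skip the PTEs this batch has already cleared, so the
		 * next page_vma_mapped_walk() iteration does not re-scan
		 * them (the walk itself still adds PAGE_SIZE on re-entry).
		 */
		pvmw.address += (nr_pages - 1) * PAGE_SIZE;
		pvmw.pte += nr_pages - 1;
	}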
--
Wei Yang
Help you, Help me