[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250903080839.wuivg2u7smyuxo5e@master>
Date: Wed, 3 Sep 2025 08:08:39 +0000
From: Wei Yang <richard.weiyang@...il.com>
To: Dev Jain <dev.jain@....com>
Cc: akpm@...ux-foundation.org, david@...hat.com, kas@...nel.org,
willy@...radead.org, hughd@...gle.com, ziy@...dia.com,
baolin.wang@...ux.alibaba.com, lorenzo.stoakes@...cle.com,
Liam.Howlett@...cle.com, npache@...hat.com, ryan.roberts@....com,
baohua@...nel.org, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
On Wed, Sep 03, 2025 at 11:16:34AM +0530, Dev Jain wrote:
>Currently khugepaged does not collapse a region which does not have a
>single writable page. This is wasteful since non-writable VMAs mapped by
>the application won't benefit from THP collapse. Therefore, remove this
>restriction and allow khugepaged to collapse a VMA with arbitrary
>protections.
>
>Along with this, currently MADV_COLLAPSE does not perform a collapse on a
>non-writable VMA, and this restriction is nowhere to be found on the
>manpage - the restriction itself sounds wrong to me since the user knows
>the protection of the memory it has mapped, so collapsing read-only
>memory via madvise() should be a choice of the user which shouldn't
>be overriden by the kernel.
>
>On an arm64 machine, an average of 5% improvement is seen on some mmtests
>benchmarks, particularly hackbench, with a maximum improvement of 12%.
>
>Signed-off-by: Dev Jain <dev.jain@....com>
>---
[...]
> mm/khugepaged.c | 9 ++-------
> 1 file changed, 2 insertions(+), 7 deletions(-)
>
>diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>index 4ec324a4c1fe..a0f1df2a7ae6 100644
>--- a/mm/khugepaged.c
>+++ b/mm/khugepaged.c
>@@ -676,9 +676,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
> writable = true;
> }
>
>- if (unlikely(!writable)) {
>- result = SCAN_PAGE_RO;
>- } else if (unlikely(cc->is_khugepaged && !referenced)) {
Would this cause more memory usage in system?
For example, one application would fork itself many times. It executable area
is read only, so all of them share one copy in memory.
Now we may collapse the range and create one copy for each process.
Ok, we have max_ptes_shared, while if some ptes are none, could it still do
collapse?
Maybe this is not realistic, just curious.
>+ if (unlikely(cc->is_khugepaged && !referenced)) {
> result = SCAN_LACK_REFERENCED_PAGE;
> } else {
> result = SCAN_SUCCEED;
>@@ -1421,9 +1419,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
> mmu_notifier_test_young(vma->vm_mm, _address)))
> referenced++;
> }
>- if (!writable) {
>- result = SCAN_PAGE_RO;
>- } else if (cc->is_khugepaged &&
>+ if (cc->is_khugepaged &&
> (!referenced ||
> (unmapped && referenced < HPAGE_PMD_NR / 2))) {
> result = SCAN_LACK_REFERENCED_PAGE;
>@@ -2830,7 +2826,6 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
> case SCAN_PMD_NULL:
> case SCAN_PTE_NON_PRESENT:
> case SCAN_PTE_UFFD_WP:
>- case SCAN_PAGE_RO:
> case SCAN_LACK_REFERENCED_PAGE:
> case SCAN_PAGE_NULL:
> case SCAN_PAGE_COUNT:
>--
>2.30.2
>
--
Wei Yang
Help you, Help me
Powered by blists - more mailing lists