[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7d423d50-47e0-4c97-abaa-1fa865ec3e42@redhat.com>
Date: Mon, 1 Sep 2025 18:21:02 +0200
From: David Hildenbrand <david@...hat.com>
To: Nico Pache <npache@...hat.com>, linux-mm@...ck.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-trace-kernel@...r.kernel.org
Cc: ziy@...dia.com, baolin.wang@...ux.alibaba.com,
lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, ryan.roberts@....com,
dev.jain@....com, corbet@....net, rostedt@...dmis.org, mhiramat@...nel.org,
mathieu.desnoyers@...icios.com, akpm@...ux-foundation.org,
baohua@...nel.org, willy@...radead.org, peterx@...hat.com,
wangkefeng.wang@...wei.com, usamaarif642@...il.com, sunnanyong@...wei.com,
vishal.moola@...il.com, thomas.hellstrom@...ux.intel.com,
yang@...amperecomputing.com, kirill.shutemov@...ux.intel.com,
aarcange@...hat.com, raquini@...hat.com, anshuman.khandual@....com,
catalin.marinas@....com, tiwai@...e.de, will@...nel.org,
dave.hansen@...ux.intel.com, jack@...e.cz, cl@...two.org,
jglisse@...gle.com, surenb@...gle.com, zokeefe@...gle.com,
hannes@...xchg.org, rientjes@...gle.com, mhocko@...e.com,
rdunlap@...radead.org, hughd@...gle.com
Subject: Re: [PATCH v10 00/13] khugepaged: mTHP support
On 19.08.25 15:41, Nico Pache wrote:
> The following series provides khugepaged with the capability to collapse
> anonymous memory regions to mTHPs.
>
> To achieve this we generalize the khugepaged functions to no longer depend
> on PMD_ORDER. Then during the PMD scan, we use a bitmap to track chunks of
> pages (defined by KHUGEPAGED_MTHP_MIN_ORDER) that are utilized. After the
> PMD scan is done, we do binary recursion on the bitmap to find the optimal
> mTHP sizes for the PMD range. The restriction on max_ptes_none is removed
> during the scan, to make sure we account for the whole PMD range. When no
> mTHP size is enabled, the legacy behavior of khugepaged is maintained.
> max_ptes_none will be scaled by the attempted collapse order to determine
> how full a mTHP must be to be eligible for the collapse to occur. If a
> mTHP collapse is attempted, but contains swapped out, or shared pages, we
> don't perform the collapse. It is now also possible to collapse to mTHPs
> without requiring the PMD THP size to be enabled.
>
> With the default max_ptes_none=511, the code should keep its most of its
> original behavior. When enabling multiple adjacent (m)THP sizes we need to
> set max_ptes_none<=255. With max_ptes_none > HPAGE_PMD_NR/2 you will
> experience collapse "creep" and constantly promote mTHPs to the next
> available size. This is due the fact that a collapse will introduce at
> least 2x the number of pages, and on a future scan will satisfy the
> promotion condition once again.
>
> Patch 1: Refactor/rename hpage_collapse
> Patch 2: Some refactoring to combine madvise_collapse and khugepaged
> Patch 3-5: Generalize khugepaged functions for arbitrary orders
> Patch 6-8: The mTHP patches
> Patch 9-10: Allow khugepaged to operate without PMD enabled
> Patch 11-12: Tracing/stats
> Patch 13: Documentation
Would it be feasible to start with simply not supporting the
max_pte_none parameter in the first version, just like we won't support
max_pte_swapped/max_pte_shared in the first version?
That gives us more time to think about how to use/modify the old interface.
For example, I could envision a ratio-based interface, or as discussed
with Lorenzo a simple boolean. We could make the existing max_ptes*
interface backwards compatible then.
That also gives us the opportunity to think about the creep problem
separately.
I'm sure initial mTHP collapse will be valuable even without support for
that weird set of parameters.
Would there be implementation-wise a problem?
But let me think further about the creep problem ... :/
--
Cheers
David / dhildenb
Powered by blists - more mailing lists