Message-ID: <CAA1CXcDKOPk+7keQG43_0PzaAnVFLDrVNq=rnZK_m_QVFjk8og@mail.gmail.com>
Date: Wed, 28 May 2025 21:52:20 -0600
From: Nico Pache <npache@...hat.com>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>
Cc: linux-mm@...ck.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
david@...hat.com, ziy@...dia.com, lorenzo.stoakes@...cle.com,
Liam.Howlett@...cle.com, ryan.roberts@....com, dev.jain@....com,
corbet@....net, rostedt@...dmis.org, mhiramat@...nel.org,
mathieu.desnoyers@...icios.com, akpm@...ux-foundation.org, baohua@...nel.org,
willy@...radead.org, peterx@...hat.com, wangkefeng.wang@...wei.com,
usamaarif642@...il.com, sunnanyong@...wei.com, vishal.moola@...il.com,
thomas.hellstrom@...ux.intel.com, yang@...amperecomputing.com,
kirill.shutemov@...ux.intel.com, aarcange@...hat.com, raquini@...hat.com,
anshuman.khandual@....com, catalin.marinas@....com, tiwai@...e.de,
will@...nel.org, dave.hansen@...ux.intel.com, jack@...e.cz, cl@...two.org,
jglisse@...gle.com, surenb@...gle.com, zokeefe@...gle.com, hannes@...xchg.org,
rientjes@...gle.com, mhocko@...e.com, rdunlap@...radead.org
Subject: Re: [PATCH v7 00/12] khugepaged: mTHP support
On Wed, May 28, 2025 at 6:39 AM Baolin Wang
<baolin.wang@...ux.alibaba.com> wrote:
>
>
>
> On 2025/5/15 11:22, Nico Pache wrote:
> > The following series provides khugepaged and madvise collapse with the
> > capability to collapse anonymous memory regions to mTHPs.
> >
> > To achieve this we generalize the khugepaged functions to no longer depend
> > on PMD_ORDER. Then during the PMD scan, we keep track of chunks of pages
> > (defined by KHUGEPAGED_MTHP_MIN_ORDER) that are utilized. This info is
> > tracked using a bitmap. After the PMD scan is done, we do binary recursion
> > on the bitmap to find the optimal mTHP sizes for the PMD range. The
> > restriction on max_ptes_none is removed during the scan, to make sure we
> > account for the whole PMD range. When no mTHP size is enabled, the legacy
> > behavior of khugepaged is maintained. max_ptes_none will be scaled by the
> > attempted collapse order to determine how full a THP must be to be
> > eligible. If an mTHP collapse is attempted but the range contains
> > swapped-out or shared pages, we don't perform the collapse.
> >
> > With the default max_ptes_none=511, the code should keep most of its
> > original behavior. To exercise mTHP collapse we need to set
> > max_ptes_none<=255. With max_ptes_none > HPAGE_PMD_NR/2 you will
> > experience collapse "creep" and constantly promote mTHPs to the next
> > available size. This is due to the fact that each collapse will introduce
> > at least 2x the number of pages, so a future scan will satisfy that
> > condition once again.
> >
> > Patch 1: Refactor/rename hpage_collapse
> > Patch 2: Some refactoring to combine madvise_collapse and khugepaged
> > Patch 3-5: Generalize khugepaged functions for arbitrary orders
> > Patch 6-9: The mTHP patches
> > Patch 10-11: Tracing/stats
> > Patch 12: Documentation
>
> When I tested 64K mTHP collapse and disabled PMD-sized THP, I found that
> khugepaged couldn't scan and collapse 64K mTHP. I sent out two fix
> patches[1], and with these patches applied, 64K mTHP collapse works
> well. I hope my two patches can be folded into your next version series
> if you think there are no issues. Thanks.
Thank you for looking into that and fixing it. I had originally
decided to only allow khugepaged to collapse to mTHP if the PMD size
was enabled as well. Lifting that restriction was on my todo list :)
I'll work on adding your patches to my set, and do some proper testing
again!
>
> [1]
> https://lore.kernel.org/all/ac9ed6d71b439611f9c94b3506a8ce975d4636e9.1748435162.git.baolin.wang@linux.alibaba.com/
>
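For anyone following the thread, the max_ptes_none scaling and the
"creep" condition from the quoted cover letter can be worked through
with a small model. This is only a sketch, not the kernel code: it
assumes 4K base pages with a PMD order of 9 (512 PTEs), and it assumes
the scaling is a right shift by the difference between the PMD order
and the attempted order. The function names are made up for
illustration.

```python
# Toy model of max_ptes_none scaling (assumptions: PMD order 9,
# scaling by right shift; names are hypothetical, not kernel symbols).
HPAGE_PMD_ORDER = 9
HPAGE_PMD_NR = 1 << HPAGE_PMD_ORDER  # 512 PTEs per PMD


def scaled_max_ptes_none(max_ptes_none, order):
    """Scale the PMD-level tunable down to the attempted collapse order."""
    return max_ptes_none >> (HPAGE_PMD_ORDER - order)


def can_collapse(populated, order, max_ptes_none):
    """True if a 2**order PTE region has few enough empty PTEs."""
    none = (1 << order) - populated
    return none <= scaled_max_ptes_none(max_ptes_none, order)


def creep(start_order, max_ptes_none):
    """Start from a freshly collapsed mTHP of start_order and keep
    promoting while the next order qualifies using only the pages the
    previous collapse introduced. Returns the final order reached."""
    order = start_order
    while order < HPAGE_PMD_ORDER:
        populated = 1 << order  # the collapsed mTHP is fully populated
        if not can_collapse(populated, order + 1, max_ptes_none):
            break
        order += 1
    return order
```

Under these assumptions, creep(4, 255) stays at order 4, because half
of the next-larger region is still empty and the scaled limit is just
below half. creep(4, 256) and creep(4, 511) both climb all the way to
order 9, which matches the max_ptes_none > HPAGE_PMD_NR/2 condition in
the cover letter.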