Message-ID: <9e1b4602-2b4c-4cfb-9f06-1799d3c0d387@lucifer.local>
Date: Tue, 26 Aug 2025 11:43:50 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Dev Jain <dev.jain@....com>
Cc: David Hildenbrand <david@...hat.com>, Nico Pache <npache@...hat.com>,
linux-mm@...ck.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
ziy@...dia.com, baolin.wang@...ux.alibaba.com, Liam.Howlett@...cle.com,
ryan.roberts@....com, corbet@....net, rostedt@...dmis.org,
mhiramat@...nel.org, mathieu.desnoyers@...icios.com,
akpm@...ux-foundation.org, baohua@...nel.org, willy@...radead.org,
peterx@...hat.com, wangkefeng.wang@...wei.com, usamaarif642@...il.com,
sunnanyong@...wei.com, vishal.moola@...il.com,
thomas.hellstrom@...ux.intel.com, yang@...amperecomputing.com,
kirill.shutemov@...ux.intel.com, aarcange@...hat.com,
raquini@...hat.com, anshuman.khandual@....com, catalin.marinas@....com,
tiwai@...e.de, will@...nel.org, dave.hansen@...ux.intel.com,
jack@...e.cz, cl@...two.org, jglisse@...gle.com, surenb@...gle.com,
zokeefe@...gle.com, hannes@...xchg.org, rientjes@...gle.com,
mhocko@...e.com, rdunlap@...radead.org, hughd@...gle.com
Subject: Re: [PATCH v10 00/13] khugepaged: mTHP support
On Fri, Aug 22, 2025 at 09:03:41PM +0530, Dev Jain wrote:
>
> On 22/08/25 8:19 pm, Lorenzo Stoakes wrote:
> > On Fri, Aug 22, 2025 at 04:10:35PM +0200, David Hildenbrand wrote:
> > > > > One could also easily support the value 255 (HPAGE_PMD_NR / 2 - 1), but not sure
> > > > > if we have to add that for now.
> > > > Yeah not so sure about this, this is a 'just have to know' too, and yes you
> > > > might add it to the docs, but people are going to be mightily confused, esp if
> > > > it's a calculated value.
> > > >
> > > > I don't see any other way around having a separate tunable if we don't just have
> > > > something VERY simple like on/off.
> > > Yeah, not advocating that we add support for other values than 0/511,
> > > really.
> > Yeah I'm fine with 0/511.
> >
> > > > Also the mentioned issue sounds like something that needs to be fixed elsewhere
> > > > honestly in the algorithm used to figure out mTHP ranges (I may be wrong - and
> > > > happy to stand corrected if this is somehow inherent, but really feels that
> > > > way).
> > > I think the creep is unavoidable for certain values.
> > >
> > > If you have the first two pages of a PMD area populated, and you allow for
> > > at least half of the #PTEs to be none/zero, you'd first collapse an
> > > order-2 folio, then an order-3 ... until you reached PMD order.
> > Feels like we should be looking at this in reverse? What's the largest, then
> > next largest, then etc.?
> >
> > Surely this is the sensible way of doing it?
>
> What David means to say is, for example, suppose all orders are enabled,
> and we fail to collapse for order-9, then order-8, then order-7, and so on,
> *only* because the distribution of ptes did not obey the scaled max_ptes_none.
> Let order-4 collapse succeed.
Ah so it is the overhead of this that's the problem?
All roads lead to David's suggestion imo.
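
To make the creep David describes above concrete, here is a rough toy model in
Python. It is only a sketch under assumptions, not the code from the series: it
assumes max_ptes_none is scaled per order by a right shift, that each scan tries
the largest order first, and that the populated PTEs sit at the start of the PMD
range. The names scaled_max_ptes_none(), scan_once() and creep() are made up for
illustration.

```python
HPAGE_PMD_ORDER = 9
HPAGE_PMD_NR = 1 << HPAGE_PMD_ORDER  # 512 PTEs per PMD (4K pages assumed)


def scaled_max_ptes_none(max_ptes_none, order):
    # Assumed scaling: shrink the PMD-level limit in proportion to the order.
    return max_ptes_none >> (HPAGE_PMD_ORDER - order)


def scan_once(populated, max_ptes_none, min_order=2):
    """One scan of a PMD range whose first `populated` PTEs are present.

    Tries the largest order first and returns the populated count after
    the first collapse that succeeds (or unchanged if none does).
    """
    for order in range(HPAGE_PMD_ORDER, min_order - 1, -1):
        nr = 1 << order
        if populated >= nr:
            # Already backed by a folio at least this large.
            return populated
        none = nr - populated
        if none <= scaled_max_ptes_none(max_ptes_none, order):
            return nr  # collapse fills the whole order-sized range
    return populated


def creep(max_ptes_none, populated=2):
    """Repeat scans until nothing changes; return the size after each collapse."""
    history = []
    while True:
        new = scan_once(populated, max_ptes_none)
        if new == populated:
            return history
        history.append(new)
        populated = new


if __name__ == "__main__":
    for limit in (0, 255, 256, 511):
        print(limit, creep(limit))
```

In this model a limit of 256 (half the PTEs allowed to be none) walks from two
populated pages through every order up to the PMD, one collapse per scan, while
255 never collapses at all and 0/511 either do nothing or go straight to PMD
order.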
> > By having 0/511 we can really simplify the 'scaling' logic too which would be
> > fantastic! :)
>
> FWIW here was my implementation of this thing, for ease of everyone:
> https://lore.kernel.org/all/20250211111326.14295-17-dev.jain@arm.com/
That's fine, but I really think we should just replace all this stuff with a
boolean, and change the max_ptes_none interface so that writing 511 sets the
boolean and writing 0 clears it.
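
As a purely illustrative sketch of that interface (Python, hypothetical names):
the mapping of 511 and 0 is what is proposed above, whereas rejecting every
other value is my assumption here, since nothing in this thread settles what
should happen to them.

```python
HPAGE_PMD_NR = 512  # PTEs per PMD, assuming 4K pages and 2M PMDs


def parse_max_ptes_none(value):
    """Map a value written to the tunable onto the proposed boolean.

    Hypothetical helper: 511 sets the boolean, 0 clears it.
    Rejecting anything else is an assumption, not something agreed on.
    """
    if value == HPAGE_PMD_NR - 1:
        return True
    if value == 0:
        return False
    raise ValueError(f"unsupported max_ptes_none value: {value}")
```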
Cheers, Lorenzo