Message-ID: <a1942809-ad8b-4a8d-85c0-74ffa2fbb53d@lucifer.local>
Date: Fri, 22 Aug 2025 15:49:28 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: David Hildenbrand <david@...hat.com>
Cc: Nico Pache <npache@...hat.com>, Dev Jain <dev.jain@....com>,
linux-mm@...ck.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
ziy@...dia.com, baolin.wang@...ux.alibaba.com, Liam.Howlett@...cle.com,
ryan.roberts@....com, corbet@....net, rostedt@...dmis.org,
mhiramat@...nel.org, mathieu.desnoyers@...icios.com,
akpm@...ux-foundation.org, baohua@...nel.org, willy@...radead.org,
peterx@...hat.com, wangkefeng.wang@...wei.com, usamaarif642@...il.com,
sunnanyong@...wei.com, vishal.moola@...il.com,
thomas.hellstrom@...ux.intel.com, yang@...amperecomputing.com,
kirill.shutemov@...ux.intel.com, aarcange@...hat.com,
raquini@...hat.com, anshuman.khandual@....com, catalin.marinas@....com,
tiwai@...e.de, will@...nel.org, dave.hansen@...ux.intel.com,
jack@...e.cz, cl@...two.org, jglisse@...gle.com, surenb@...gle.com,
zokeefe@...gle.com, hannes@...xchg.org, rientjes@...gle.com,
mhocko@...e.com, rdunlap@...radead.org, hughd@...gle.com
Subject: Re: [PATCH v10 00/13] khugepaged: mTHP support
On Fri, Aug 22, 2025 at 04:10:35PM +0200, David Hildenbrand wrote:
> > > One could also easily support the value 255 (HPAGE_PMD_NR / 2 - 1), but not sure
> > > if we have to add that for now.
> >
> > Yeah not so sure about this, this is a 'just have to know' too, and yes you
> > might add it to the docs, but people are going to be mightily confused, esp if
> > it's a calculated value.
> >
> > I don't see any other way around having a separate tunable if we don't just have
> > something VERY simple like on/off.
>
> Yeah, not advocating that we add support for other values than 0/511,
> really.
Yeah I'm fine with 0/511.
>
> >
> > Also the mentioned issue sounds like something that needs to be fixed elsewhere
> > honestly in the algorithm used to figure out mTHP ranges (I may be wrong - and
> > happy to stand corrected if this is somehow inherent, but it really feels that
> > way).
>
> I think the creep is unavoidable for certain values.
>
> If you have the first two pages of a PMD area populated, and you allow for
> at least half of the #PTEs to be none/zero, you'd collapse first an
> order-2 folio, then an order-3 ... until you reached PMD order.
Feels like we should be looking at this in reverse? What's the largest, then
next largest, then etc.?
Surely this is the sensible way of doing it?
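To make the creep described in the quoted text concrete, here is a rough
userspace sketch, not kernel code. It assumes 4 KiB pages (HPAGE_PMD_ORDER
of 9), a smallest collapse order of 2, populated PTEs packed at the start of
the PMD range, and a per-order limit obtained by right-shifting max_ptes_none.
The helper names scaled_max_none() and creep_final_order() are made up for
illustration.

```c
#include <assert.h>

#define HPAGE_PMD_ORDER 9u                      /* assumes 4 KiB pages, 2 MiB PMD */
#define HPAGE_PMD_NR    (1u << HPAGE_PMD_ORDER)
#define MIN_MTHP_ORDER  2u                      /* assumed smallest collapse order */

/* Assumed scaling rule: shrink the PMD-level limit in proportion to the order. */
static unsigned int scaled_max_none(unsigned int max_ptes_none, unsigned int order)
{
	return max_ptes_none >> (HPAGE_PMD_ORDER - order);
}

/*
 * Walk the orders bottom-up, as in the quoted description. "populated" is
 * the number of present PTEs, assumed contiguous from the start of the
 * range. Returns the largest order reached, or 0 if nothing collapses.
 */
static unsigned int creep_final_order(unsigned int populated, unsigned int max_ptes_none)
{
	unsigned int final_order = 0;
	unsigned int order;

	assert(populated <= HPAGE_PMD_NR);

	for (order = MIN_MTHP_ORDER; order <= HPAGE_PMD_ORDER; order++) {
		unsigned int nr = 1u << order;
		unsigned int none = populated >= nr ? 0 : nr - populated;

		if (none > scaled_max_none(max_ptes_none, order))
			break;

		/* A collapse fills the holes, so the whole folio is now populated. */
		if (populated < nr)
			populated = nr;
		final_order = order;
	}
	return final_order;
}
```

Under these assumptions, two populated pages with max_ptes_none set to 256
creep all the way to order 9, while 255 (HPAGE_PMD_NR / 2 - 1) stops before
the first collapse.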
>
> So for now we really should just support 0 / 511 to say "don't collapse if
> there are holes" vs. "always collapse if there is at least one pte used".
Yes.
>
> >
> > >
> > > Because, as raised in the past, I'm afraid nobody on this earth has a clue how
> > > to set this parameter to values different to 0 (don't waste memory with khugepaged)
> > > and 511 (page fault behavior).
> >
> > Yup
> >
> > >
> > >
> > > If any other value is set, essentially
> > > pr_warn("Unsupported 'max_ptes_none' value for mTHP collapse");
> > >
> > > for now and just disable it.
> >
> > Hmm but under what circumstances? I would just say 'unsupported value' and not
> > mention mTHP, or people who don't use mTHP might find that confusing.
>
> Well, we can check whether any mTHP size is enabled while the value is set
> to something unexpected. We can then even print the problematic sizes if we
> have to.
Ack
>
> We could also just say that if the value is set to something else than
> 511 (which is the default), it will be treated as being "0" when collapsing
> mthp, instead of doing any scaling.
Or we could make it an error to set anything but 0, 511, but on the other hand
that's likely to break userspace so yeah probably not.
Maybe have a warning saying 'this is no longer supported and will be ignored'
then set the value to 0 for anything but 511 or 0.
Then we can remove the warning later.
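A minimal userspace sketch of that warn-and-reset idea, purely illustrative:
sanitize_max_ptes_none() is a hypothetical name, fprintf() stands in for
pr_warn(), and 511 assumes HPAGE_PMD_NR is 512.

```c
#include <stdbool.h>
#include <stdio.h>

#define HPAGE_PMD_NR 512u   /* assumes 4 KiB pages, 2 MiB PMD */

/*
 * Accept only 0 and HPAGE_PMD_NR - 1. Any other request is warned about
 * and replaced by 0. *warned reports whether the warning fired.
 */
static unsigned int sanitize_max_ptes_none(unsigned int requested, bool *warned)
{
	if (requested == 0 || requested == HPAGE_PMD_NR - 1) {
		*warned = false;
		return requested;
	}

	fprintf(stderr,
		"max_ptes_none=%u is no longer supported and will be ignored, using 0\n",
		requested);
	*warned = true;
	return 0;
}
```

Resetting instead of returning an error keeps existing writers of the
tunable working, which is the userspace-breakage concern above.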
By having 0/511 we can really simplify the 'scaling' logic too which would be
fantastic! :)
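As a sketch of how small the per-order limit could become with only 0/511,
assuming 4 KiB pages and treating any other value as 0 as suggested in the
quoted text (mthp_max_none() is a hypothetical helper, not something from
the series):

```c
#define HPAGE_PMD_ORDER 9u                      /* assumes 4 KiB pages, 2 MiB PMD */
#define HPAGE_PMD_NR    (1u << HPAGE_PMD_ORDER)

/*
 * With only two supported settings no proportional scaling is needed:
 * 511 means "collapse if at least one PTE is used" at every order,
 * anything else means "don't collapse if there are holes".
 */
static unsigned int mthp_max_none(unsigned int max_ptes_none, unsigned int order)
{
	if (max_ptes_none == HPAGE_PMD_NR - 1)
		return (1u << order) - 1;
	return 0;
}
```

For example, mthp_max_none(511, 2) allows 3 empty PTEs in an order-2 range,
and mthp_max_none(0, 4) allows none.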
Cheers, Lorenzo