[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <01d679f2-fd64-472c-b9dc-ebe87268ce82@lucifer.local>
Date: Wed, 25 Jun 2025 08:55:28 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: David Hildenbrand <david@...hat.com>
Cc: Hugh Dickins <hughd@...gle.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>, akpm@...ux-foundation.org,
ziy@...dia.com, Liam.Howlett@...cle.com, npache@...hat.com,
ryan.roberts@....com, dev.jain@....com, baohua@...nel.org,
zokeefe@...gle.com, shy828301@...il.com, usamaarif642@...il.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 0/2] fix MADV_COLLAPSE issue if THP settings are
disabled
On Wed, Jun 25, 2025 at 09:34:51AM +0200, David Hildenbrand wrote:
> > >
> > > We would all prefer a less messy world of THP tunables. I certainly
> > > find plenty to dislike there too; and wish that a less assertive name
> > > than "never" had been chosen originally for the default off position.
> > >
> > > But please don't break the accepted and documented behaviour of
> > > MADV_COLLAPSE now.
> >
> > Again see above, I absolutely disagree this is documented _clearly_. And
> > that's the underlying issue here.
> > > I feel like if you polled 100 system administrators (assuming they knew
> > about THP) as to how you globally disable THP, probably all 100 would say
> > you do it via:
> >
> > # echo never > /sys/kernel/mm/transparent_hugepage/enabled
> >
>
> Yes. One big problem is that the documentation was not updated.
>
> Changing the meaning of "entirely disabled" to "entirely disabled
> automatically (page faults, khugepaged)"
I mean, if we decide to drop this series, I will fix this in both man page
and documentation.
It'll be basically heavily underlining the fact that sysfs never DOES NOT
MEAN NEVER it means never _automatically_ as you say.
>
> > So shouldn't 'never break userspace' be based on practical reality rather
> > than a theorised interpretation of documents that sadly are not clear
> > enough?
>
> I think the problem is that there might indeed be more users out there
> relying on "never+MADV_COLLPASE" to now place THPs than
> "never+MADV_COLLPASE" to no place THPs.
>
> What is the harm when not placing THPs? Performance degradation for some
> apps?
Right, but we're already not placing the _right_ kinds of
pages. MADV_COLLAPSE was a pre-mTHP era thing...
>
> What is the harm when placing THPs although disabled? Making the life for
> David for debugging customer issues harder :P
Well Baolin pointed out the case of a system that only enables mTHP for
instance. Now these apps all get PMD pages. That is surprising and also
absolutely reduces availablility of mTHP pages.
This is a very big problem.
I guess this comes down to - do we have users who are _absolutely reliant_
on never + collapse?
And no we do not want to degrade performance. But do such people exist?
I'm guessing Hugh might know ;)
> > While it's useful to have this, prctl() is where APIs go to die. It's a
> > hidden wasteland that nobody knows about, it may as well not exist.
> >
> > We have a whole sysctl directory for configuring this stuff. It's sort of
> > crazy to have that then to have a special prctl() hidden away also...
>
> prctl(PR_SET_THP_DISABLE) is today not really a reasonable way for an admin
> to disable THPs system wide I'm afraid.
Yes.
>
> >
> > >
> > > Add a "deny" option to /sys/kernel/mm/transparent_hugepage/enabled
> > > if you like. (But in these days of filesystem large folios, adding
> > > new protections against them seems a few years late.)
> >
> > Based on a reasonable interpretation of 'never' I would say we retain this
> > series as-is, and 'deny' could be what 'never' was intended to be before.
>
> At least for shmem_enabled never means "unless overwritten per mount through
> huge=" and "deny" means "force off for all mounts". So "deny" is "harder"
This is so... ugh. Lord. My sweet English language... "I will never let you
in here" vs. "I deny you entry", one is a lot stronger, and it's not the
latter one...
Anyway ok given that this is the case, I guess we could use deny for this.
>
> Inverting the semantics here might be causing even more problems.
Yeah you're probably right.
But obviously every distro who defaults to 'never' is going to get a
surprise here.
>
> --
> Cheers,
>
> David / dhildenb
>
I suppose the least awful way of addressing Baolin's concerns re: mTHP
while simultaneosly keeping existing semantics is:
1. Introduce deny to mean what never should have meant.
2. Use something like the logic here to enforce it.
3. Heavily document it (I can do this).
But I still find this yucky based on the fact that nobody will expect any
of this.
But maybe the THP toggles are such a mess that we're past that anyway?...
Powered by blists - more mailing lists