[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <008ec97f-3b33-4b47-a112-9cef8c1d7f58@redhat.com>
Date: Wed, 25 Jun 2025 09:34:51 +0200
From: David Hildenbrand <david@...hat.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Hugh Dickins <hughd@...gle.com>
Cc: Baolin Wang <baolin.wang@...ux.alibaba.com>, akpm@...ux-foundation.org,
ziy@...dia.com, Liam.Howlett@...cle.com, npache@...hat.com,
ryan.roberts@....com, dev.jain@....com, baohua@...nel.org,
zokeefe@...gle.com, shy828301@...il.com, usamaarif642@...il.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 0/2] fix MADV_COLLAPSE issue if THP settings are
disabled
>>
>> We would all prefer a less messy world of THP tunables. I certainly
>> find plenty to dislike there too; and wish that a less assertive name
>> than "never" had been chosen originally for the default off position.
>>
>> But please don't break the accepted and documented behaviour of
>> MADV_COLLAPSE now.
>
> Again see above, I absolutely disagree this is documented _clearly_. And
> that's the underlying issue here.
> > I feel like if you polled 100 system administrators (assuming they knew
> about THP) as to how you globally disable THP, probably all 100 would say
> you do it via:
>
> # echo never > /sys/kernel/mm/transparent_hugepage/enabled
>
Yes. One big problem is that the documentation was not updated.
Changing the meaning of "entirely disabled" to "entirely disabled
automatically (page faults, khugepaged)"
> So shouldn't 'never break userspace' be based on practical reality rather
> than a theorised interpretation of documents that sadly are not clear
> enough?
I think the problem is that there might indeed be more users out there
relying on "never+MADV_COLLPASE" to now place THPs than
"never+MADV_COLLPASE" to no place THPs.
What is the harm when not placing THPs? Performance degradation for some
apps?
What is the harm when placing THPs although disabled? Making the life
for David for debugging customer issues harder :P
>
>>
>> If you want to exclude all possibility of THPs, then please use the
>> prctl(PR_SET_THP_DISABLE); or shmem_enabled=deny (I think it was me
>> who insisted that be respected by MADV_COLLAPSE back then).
>
> While it's useful to have this, prctl() is where APIs go to die. It's a
> hidden wasteland that nobody knows about, it may as well not exist.
>
> We have a whole sysctl directory for configuring this stuff. It's sort of
> crazy to have that then to have a special prctl() hidden away also...
prctl(PR_SET_THP_DISABLE) is today not really a reasonable way for an
admin to disable THPs system wide I'm afraid.
>
>>
>> Add a "deny" option to /sys/kernel/mm/transparent_hugepage/enabled
>> if you like. (But in these days of filesystem large folios, adding
>> new protections against them seems a few years late.)
>
> Based on a reasonable interpretation of 'never' I would say we retain this
> series as-is, and 'deny' could be what 'never' was intended to be before.
At least for shmem_enabled never means "unless overwritten per mount
through huge=" and "deny" means "force off for all mounts". So "deny" is
"harder"
Inverting the semantics here might be causing even more problems.
--
Cheers,
David / dhildenb
Powered by blists - more mailing lists