Message-ID: <08e3cf90-76f5-4a69-813d-94315943d37c@lucifer.local>
Date: Tue, 22 Jul 2025 06:24:36 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Barry Song <21cnbao@...il.com>
Cc: Baolin Wang <baolin.wang@...ux.alibaba.com>,
Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...hat.com>, Zi Yan <ziy@...dia.com>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
Dev Jain <dev.jain@....com>, Jonathan Corbet <corbet@....net>,
linux-mm@...ck.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] docs: update THP documentation to clarify sysfs "never"
setting
On Tue, Jul 22, 2025 at 10:23:39AM +0800, Barry Song wrote:
> On Tue, Jul 22, 2025 at 9:30 AM Baolin Wang
> <baolin.wang@...ux.alibaba.com> wrote:
> >
> >
> >
> > On 2025/7/21 23:55, Lorenzo Stoakes wrote:
> > > Rather confusingly, setting all Transparent Huge Page sysfs settings to
> > > "never" does not in fact result in THP being globally disabled.
> > >
> > > Rather, it results in khugepaged being disabled, but one can still obtain
> > > THP pages using madvise(..., MADV_COLLAPSE).
> > >
> > > This is something that has remained poorly documented for some time, and it
> > > is likely the received wisdom of most users of THP that never does, in
> > > fact, mean never.
> > >
> > > It is therefore important to highlight, very clearly, that this is not the
> > > > case.
> > >
> > > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> > > ---
> > > Documentation/admin-guide/mm/transhuge.rst | 11 +++++++++--
> > > 1 file changed, 9 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> > > index dff8d5985f0f..182519197ef7 100644
> > > --- a/Documentation/admin-guide/mm/transhuge.rst
> > > +++ b/Documentation/admin-guide/mm/transhuge.rst
> > > @@ -107,7 +107,7 @@ sysfs
> > > Global THP controls
> > > -------------------
> > >
> > > -Transparent Hugepage Support for anonymous memory can be entirely disabled
> > > +Transparent Hugepage Support for anonymous memory can be disabled
> > > (mostly for debugging purposes) or only enabled inside MADV_HUGEPAGE
> > > regions (to avoid the risk of consuming more memory resources) or enabled
> > > system wide. This can be achieved per-supported-THP-size with one of::
> > > @@ -119,6 +119,11 @@ system wide. This can be achieved per-supported-THP-size with one of::
> > > where <size> is the hugepage size being addressed, the available sizes
> > > for which vary by system.
> > >
> > > +.. note:: Setting "never" in all sysfs THP controls does **not** disable
> > > + Transparent Huge Pages globally. This is because ``madvise(...,
> > > + MADV_COLLAPSE)`` ignores these settings and collapses ranges to
> > > + PMD-sized huge pages unconditionally.
> > > +
> > > For example::
> > >
> > > echo always >/sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
> > > @@ -187,7 +192,9 @@ madvise
> > > behaviour.
> > >
> > > never
> > > - should be self-explanatory.
> > > + should be self-explanatory. Note that ``madvise(...,
> > > + MADV_COLLAPSE)`` can still cause transparent huge pages to be
> > > + obtained even if this mode is specified everywhere.
> >
> > I hope this part of the explanation is also copy-pasted into the
> > 'Hugepages in tmpfs/shmem' section. Otherwise look good to me. Thanks.
>
> Apologies if this is a silly question, but regarding this patchset:
> https://lore.kernel.org/linux-mm/cover.1750815384.git.baolin.wang@linux.alibaba.com/
>
> It looks like the intention is to disable hugepages even for
> `MADV_COLLAPSE` when the user has set the policy to 'never'. However,
> based on Lorenzo's documentation update, it seems we still want to allow
> hugepages for `MADV_COLLAPSE` even if 'never' is set?
>
> Could you clarify what the intended behavior is? It seems we've decided
> to keep the existing behavior unchanged—am I understanding that
> correctly?
For now, see [0]: we have decided that, at this time, this series should not
be applied.
I again apologise sincerely to Baolin for this being such a back and forth
and him doing so much work here prior to this decision, but overall David
and I felt that _at this time_ we didn't want to risk breaking anybody by
changing this behaviour.
And so, as I promised, this patch updates the documentation to reflect the
current (and I entirely agree, odd) reality that 'never' does not, in fact,
mean never.
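For anyone who wants to observe the current behaviour rather than take it on
trust, below is a rough sketch in Python. It is not part of the patch. The
helper names (selected_policy, anon_huge_kb, try_collapse) are made up for
illustration, a 2 MiB PMD size is assumed (e.g. x86-64 with 4 KiB base pages),
and MADV_COLLAPSE falls back to the uapi value 25 if the mmap module does not
export it. It needs Linux 6.1 or later, and it only reads the top-level
'enabled' file, not the per-size ones.

```python
import mmap
import re

# MADV_COLLAPSE is 25 in the Linux uapi headers (Linux 6.1+). Fall back to the
# raw value in case this Python build's mmap module lacks the constant.
MADV_COLLAPSE = getattr(mmap, "MADV_COLLAPSE", 25)

PMD_SIZE = 2 * 1024 * 1024  # assumption: 2 MiB PMD-sized huge pages
SYSFS_ENABLED = "/sys/kernel/mm/transparent_hugepage/enabled"


def selected_policy(text):
    """Return the bracketed token from e.g. 'always madvise [never]'."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def anon_huge_kb(smaps_text):
    """Sum every 'AnonHugePages: N kB' line in smaps-formatted text."""
    sizes = re.findall(r"^AnonHugePages:\s+(\d+) kB", smaps_text, re.M)
    return sum(int(kb) for kb in sizes)


def read_text(path):
    with open(path) as f:
        return f.read()


def try_collapse():
    """Fault in private anonymous memory, then ask for a collapse.

    Returns the process-wide AnonHugePages total in kB after the call,
    or None if the mapping, madvise() or /proc read is unavailable.
    """
    # Map two PMDs' worth so at least one PMD-aligned range lies inside.
    length = 2 * PMD_SIZE
    try:
        with mmap.mmap(-1, length,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS) as buf:
            buf.write(b"\x01" * length)   # touch every page
            buf.madvise(MADV_COLLAPSE)    # raises OSError on failure
            return anon_huge_kb(read_text("/proc/self/smaps"))
    except (OSError, AttributeError):
        return None


if __name__ == "__main__":
    try:
        policy = selected_policy(read_text(SYSFS_ENABLED))
    except OSError:
        policy = None
    print("top-level enabled policy:", policy)
    print("AnonHugePages after MADV_COLLAPSE (kB):", try_collapse())
```

If the reported policy is 'never' and the second line still shows a non-zero
figure, that is the behaviour the note describes. With 'always' the pages may
already be huge at fault time, so the result says nothing either way.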
Cheers, Lorenzo
[0]:https://lore.kernel.org/linux-mm/573eb43a-8536-4206-a7c6-d0daa1fd7e70@lucifer.local/
>
> Thanks
> Barry