lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c071c478-c6c3-47c4-a504-b1fa650d528f@lucifer.local>
Date: Tue, 22 Jul 2025 06:29:15 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Barry Song <21cnbao@...il.com>
Cc: Baolin Wang <baolin.wang@...ux.alibaba.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        David Hildenbrand <david@...hat.com>, Zi Yan <ziy@...dia.com>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
        Dev Jain <dev.jain@....com>, Jonathan Corbet <corbet@....net>,
        linux-mm@...ck.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, Hugh Dickins <hughd@...gle.com>
Subject: Re: [PATCH] docs: update THP documentation to clarify sysfs "never"
 setting

+cc Hugh since we're mentioning him here, and not-trimming for context -
TL;DR I am updating the docs to reflect the sysfs never 'doesn't mean
never' behaviour for THP.

On Tue, Jul 22, 2025 at 11:37:07AM +0800, Barry Song wrote:
> On Tue, Jul 22, 2025 at 10:33 AM Baolin Wang
> <baolin.wang@...ux.alibaba.com> wrote:
> >
> >
> >
> > On 2025/7/22 10:23, Barry Song wrote:
> > > On Tue, Jul 22, 2025 at 9:30 AM Baolin Wang
> > > <baolin.wang@...ux.alibaba.com> wrote:
> > >>
> > >>
> > >>
> > >> On 2025/7/21 23:55, Lorenzo Stoakes wrote:
> > >>> Rather confusingly, setting all Transparent Huge Page sysfs settings to
> > >>> "never" does not in fact result in THP being globally disabled.
> > >>>
> > >>> Rather, it results in khugepaged being disabled, but one can still obtain
> > >>> THP pages using madvise(..., MADV_COLLAPSE).
> > >>>
> > >>> This is something that has remained poorly documented for some time, and it
> > >>> is likely the received wisdom of most users of THP that never does, in
> > >>> fact, mean never.
> > >>>
> > >>> It is therefore important to highlight, very clearly, that this is not the
> > >>> ase.
> > >>>
> > >>> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> > >>> ---
> > >>>    Documentation/admin-guide/mm/transhuge.rst | 11 +++++++++--
> > >>>    1 file changed, 9 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> > >>> index dff8d5985f0f..182519197ef7 100644
> > >>> --- a/Documentation/admin-guide/mm/transhuge.rst
> > >>> +++ b/Documentation/admin-guide/mm/transhuge.rst
> > >>> @@ -107,7 +107,7 @@ sysfs
> > >>>    Global THP controls
> > >>>    -------------------
> > >>>
> > >>> -Transparent Hugepage Support for anonymous memory can be entirely disabled
> > >>> +Transparent Hugepage Support for anonymous memory can be disabled
> > >>>    (mostly for debugging purposes) or only enabled inside MADV_HUGEPAGE
> > >>>    regions (to avoid the risk of consuming more memory resources) or enabled
> > >>>    system wide. This can be achieved per-supported-THP-size with one of::
> > >>> @@ -119,6 +119,11 @@ system wide. This can be achieved per-supported-THP-size with one of::
> > >>>    where <size> is the hugepage size being addressed, the available sizes
> > >>>    for which vary by system.
> > >>>
> > >>> +.. note:: Setting "never" in all sysfs THP controls does **not** disable
> > >>> +          Transparent Huge Pages globally. This is because ``madvise(...,
> > >>> +          MADV_COLLAPSE)`` ignores these settings and collapses ranges to
> > >>> +          PMD-sized huge pages unconditionally.
> > >>> +
> > >>>    For example::
> > >>>
> > >>>        echo always >/sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
> > >>> @@ -187,7 +192,9 @@ madvise
> > >>>        behaviour.
> > >>>
> > >>>    never
> > >>> -     should be self-explanatory.
> > >>> +     should be self-explanatory. Note that ``madvise(...,
> > >>> +     MADV_COLLAPSE)`` can still cause transparent huge pages to be
> > >>> +     obtained even if this mode is specified everywhere.
> > >>
> > >> I hope this part of the explanation is also copy-pasted into the
> > >> 'Hugepages in tmpfs/shmem' section. Otherwise look good to me. Thanks.
> > >
> > > Apologies if this is a silly question, but regarding this patchset:
> > > https://lore.kernel.org/linux-mm/cover.1750815384.git.baolin.wang@linux.alibaba.com/
> > >
> > > It looks like the intention is to disable hugepages even for
> > > `MADV_COLLAPSE` when the user has set the policy to 'never'. However,
> > > based on Lorenzo's documentation update, it seems we still want to allow
> > > hugepages for `MADV_COLLAPSE` even if 'never' is set?
> > >
> > > Could you clarify what the intended behavior is? It seems we've decided
> > > to keep the existing behavior unchanged—am I understanding that
> > > correctly?
> >
> > Yes, Hugh has already explicitly opposed the current changes to the
> > MADV_COLLAPSE logic[1], although there are still some disagreements that
> > cannot be resolved.
> >
> > At least we reached the consensus to update the documentation to reflect
> > the current sysfs THP control logic first, to avoid the misunderstanding
> > that 'sysfs THP controls can disable Transparent Huge Pages globally'.
>
> Nice, thanks! Personally, I prefer this approach as well. Updating the
> man page feels a bit odd, since it's something people are already
> familiar with and may have memorized.

Indeed, Hugh's input was important here and gave pause for thought. This
was not an easy decision, and I ended up changing my mind from initially
supporting this chnage... :)

We may return to it later, but for the time being this is the rather
conservative approach we've decided upon.

Re: man page - I _do_ intend to update the man page as I find it far too
vague on this topic currently, so that patch will be coming soon.

I will cc- the THP folks on that patch when I send it.

>
> >
> > [1]
> > https://lore.kernel.org/linux-mm/75c02dbf-4189-958d-515e-fa80bb2187fc@google.com/
>
> Best regards
> Barry

Cheers, Lorenzo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ