Message-ID: <f36e64f2-f3d1-407e-862f-ceccc89ac9a8@lucifer.local>
Date: Wed, 25 Jun 2025 09:22:58 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: David Hildenbrand <david@...hat.com>
Cc: Hugh Dickins <hughd@...gle.com>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>, akpm@...ux-foundation.org,
        ziy@...dia.com, Liam.Howlett@...cle.com, npache@...hat.com,
        ryan.roberts@....com, dev.jain@....com, baohua@...nel.org,
        zokeefe@...gle.com, shy828301@...il.com, usamaarif642@...il.com,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 0/2] fix MADV_COLLAPSE issue if THP settings are
 disabled

On Wed, Jun 25, 2025 at 10:16:46AM +0200, David Hildenbrand wrote:
> On 25.06.25 09:49, David Hildenbrand wrote:
> > I think the whole use case of using MADV_COLLAPSE to completely control
> > THP allocation in a system is otherwise pretty hard to achieve, if there
> > is no other way to tame THP allocation through page faults+khugepaged.
>
> Just want to add: for an app itself, it's doable in "madvise" mode perfectly
> fine.
>
> If your app does a MADV_HUGEPAGE, it can get a THP during page-fault +
> khugepaged.
>
> If your app does not do a MADV_HUGEPAGE, it can get a THP through
> MADV_COLLAPSE.
>
> So the "madvise" mode actually works.

Right, but for me MADV_COLLAPSE is more about 'I want THPs _now_ (if available),
not when khugepaged decides to give me some'.
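To illustrate the 'THPs now' semantics, here is a minimal userspace sketch of a synchronous collapse request. The helper names (`align_up`, `collapse_now`) and the 2 MiB `HPAGE_SIZE` are assumptions for illustration only (the PMD size differs across architectures), and the fallback `MADV_COLLAPSE` value is the one from the Linux uapi headers for kernels 6.1 and later.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* Linux uapi value; older libc headers may lack it */
#endif

/* Assumed PMD-sized THP: 2 MiB, as on x86-64 with 4 KiB base pages. */
#define HPAGE_SIZE ((size_t)2 << 20)

/* Round x up to a multiple of align; align must be a power of two. */
static uintptr_t align_up(uintptr_t x, uintptr_t align)
{
    assert(align && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/*
 * Ask the kernel to collapse [addr, addr + len) into THPs right away,
 * rather than waiting for khugepaged. Returns 0 on success or -errno.
 * Failure is expected on kernels without MADV_COLLAPSE or when no huge
 * page can be allocated, so callers should treat it as best-effort.
 */
static int collapse_now(void *addr, size_t len)
{
    if (madvise(addr, len, MADV_COLLAPSE) == 0)
        return 0;
    return -errno;
}
```

A caller would typically mmap() a region larger than HPAGE_SIZE, align the start with align_up(), touch the memory, and then call collapse_now() on the aligned range.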

So we have multiple semantics at work here, unfortunately.

>
> The problem appears as soon as we want to control other processes that might
> be setting MADV_HUGEPAGE, and we actually want to control the behavior using
> process_madvise(MADV_COLLAPSE), to say "well, the MADV_HUGEPAGE should be
> ignored".

This is a _very_ specialist use.

I'd argue for a 'manual' mode to be added to sysfs to cover this case, with
'never' having the 'actually means never' semantics.
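For context, the currently selected mode is the bracketed token in /sys/kernel/mm/transparent_hugepage/enabled (e.g. "always [madvise] never"). The sketch below extracts that token from a string already read from the file; `thp_selected_mode` is a hypothetical helper name, and a 'manual' value would only ever show up if such a mode were actually added.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (currently selected) token of a sysfs THP setting,
 * e.g. "always [madvise] never", into out as a NUL-terminated string.
 * Returns 0 on success, or -1 if there is no complete "[...]" token or
 * it does not fit in out.
 */
static int thp_selected_mode(const char *setting, char *out, size_t n)
{
    assert(setting && out && n > 0);

    const char *open = strchr(setting, '[');
    const char *close = open ? strchr(open, ']') : NULL;
    if (!close)
        return -1;

    size_t len = (size_t)(close - open - 1);
    if (len >= n)
        return -1;

    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}
```

A tool that wants the 'actually means never' behaviour could compare the result against "never" before deciding whether to issue any collapse requests.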

You might argue that could confuse things, but it'd retain the 'de facto'
understanding nearly everybody has about what these flags mean, but give
whatever user is out there that needs this the ability to continue doing what
they want.

And we get into philosophy about not 'breaking' userland, not sure we have a
TLB/page fault/folio allocation efficiency contract with userland :)

No program will break with this patch applied. Just potentially get performance
degradation in a very, very specialist case.

>
> Then, you configure "never" system-wide and use
> process_madvise(MADV_COLLAPSE) to drive it all manually.
>
> Curious to learn if there is such a user out there.

Oh me too :)
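For reference, a rough sketch of what such a user might do from a controlling process. `remote_collapse` is a hypothetical helper; it uses raw syscalls because older libc versions lack wrappers, the fallback syscall numbers are the generic ones shared by most architectures, and the caller needs sufficient privileges over the target process.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* Linux uapi value */
#endif
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434 /* generic syscall number */
#endif
#ifndef SYS_process_madvise
#define SYS_process_madvise 440 /* generic syscall number */
#endif

/*
 * Request a collapse of [addr, addr + len) in the address space of
 * process pid. addr is an address in the target, not in the caller.
 * Returns the number of bytes advised on success, or -errno on failure.
 */
static long remote_collapse(pid_t pid, void *addr, size_t len)
{
    int pidfd = (int)syscall(SYS_pidfd_open, pid, 0U);
    if (pidfd < 0)
        return -errno;

    struct iovec iov = { .iov_base = addr, .iov_len = len };
    long ret = syscall(SYS_process_madvise, pidfd, &iov, 1UL,
                       MADV_COLLAPSE, 0U);
    int saved = errno; /* close() may clobber errno */

    close(pidfd);
    return ret < 0 ? -saved : ret;
}
```

With "never" configured system-wide, whether this call succeeds is exactly the behaviour under discussion in this thread, so the return value should be checked rather than assumed.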

>
> --
> Cheers,
>
> David / dhildenb
>
