Message-ID: <a97780ab-6256-43b7-8c0a-80ecbdc3d52d@lucifer.local>
Date: Tue, 28 Oct 2025 18:41:19 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: David Hildenbrand <david@...hat.com>
Cc: Baolin Wang <baolin.wang@...ux.alibaba.com>,
        Nico Pache <npache@...hat.com>, linux-kernel@...r.kernel.org,
        linux-trace-kernel@...r.kernel.org, linux-mm@...ck.org,
        linux-doc@...r.kernel.org, ziy@...dia.com, Liam.Howlett@...cle.com,
        ryan.roberts@....com, dev.jain@....com, corbet@....net,
        rostedt@...dmis.org, mhiramat@...nel.org,
        mathieu.desnoyers@...icios.com, akpm@...ux-foundation.org,
        baohua@...nel.org, willy@...radead.org, peterx@...hat.com,
        wangkefeng.wang@...wei.com, usamaarif642@...il.com,
        sunnanyong@...wei.com, vishal.moola@...il.com,
        thomas.hellstrom@...ux.intel.com, yang@...amperecomputing.com,
        kas@...nel.org, aarcange@...hat.com, raquini@...hat.com,
        anshuman.khandual@....com, catalin.marinas@....com, tiwai@...e.de,
        will@...nel.org, dave.hansen@...ux.intel.com, jack@...e.cz,
        cl@...two.org, jglisse@...gle.com, surenb@...gle.com,
        zokeefe@...gle.com, hannes@...xchg.org, rientjes@...gle.com,
        mhocko@...e.com, rdunlap@...radead.org, hughd@...gle.com,
        richard.weiyang@...il.com, lance.yang@...ux.dev, vbabka@...e.cz,
        rppt@...nel.org, jannh@...gle.com, pfalcato@...e.de
Subject: Re: [PATCH v12 mm-new 06/15] khugepaged: introduce
 collapse_max_ptes_none helper function

On Tue, Oct 28, 2025 at 07:17:16PM +0100, David Hildenbrand wrote:
> On 28.10.25 19:09, Lorenzo Stoakes wrote:
> > (It'd be good if we could keep all the 'solutions' in one thread as I made a
> > detailed reply there and now all that will get lost across two threads but
> > *sigh* never mind. Insert rant about email development here.)
>
> Yeah, I focused in my other mails on things to avoid creep while allowing
> for mTHP collapse.
>
> >
> > On Tue, Oct 28, 2025 at 06:56:10PM +0100, David Hildenbrand wrote:
> > > [...]
> > >
> > > >
> > > > > towards David's earlier simplified approach:
> > > > > 	max_ptes_none == 511 -> collapse mTHP always
> > > > > 	max_ptes_none != 511 -> collapse mTHP only if all PTEs are non-none/zero
> > > >
> > > > Pretty sure David's suggestion was that max_ptes_none would literally get set to
> > > > 511 if you specified 511, or 0 if you specified anything else.
> > >
> > > We had multiple incarnations of this approach, but the first one really was:
> > >
> > > max_ptes_none == 511 -> collapse mTHP always
> >
> > But won't 511 mean we just 'creep' to maximum collapse again? Does that solve
> > anything?
>
> No creep, because you'll always collapse.

OK, so in the 511 scenario, do we simply collapse immediately to the largest
possible _mTHP_ page size on the basis of adjacent none/zero PTE entries, and
_never_ collapse to PMD size on that basis, even if we do have sufficient
none/zero PTE entries to do so?

And only collapse to PMD size if we have sufficient adjacent PTE entries that
are populated?

Let's really nail this down actually so we can be super clear what the issue is
here.


>
> Creep only happens if you wouldn't collapse a PMD without prior mTHP
> collapse, but suddenly would in the same scenario simply because you had
> prior mTHP collapse.
>
> At least that's my understanding.

OK, that makes sense. Is the logic then (this may be part of the bit I haven't
reviewed yet tbh) that for khugepaged mTHP we always require a prior mTHP
collapse _first_?
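
To make the creep definition above concrete, here is a toy userspace model,
not kernel code. It assumes the threshold for a smaller order is obtained by
shifting max_ptes_none right by (PMD order - order), that 4 KiB pages give 512
PTEs per PMD, and that the smallest mTHP order is 2. All names are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define PMD_ORDER      9                 /* assumption: 4 KiB pages, 2 MiB PMD */
#define PMD_NR         (1u << PMD_ORDER) /* 512 PTEs per PMD */
#define MIN_MTHP_ORDER 2                 /* assumption: smallest mTHP order */

/* Assumed scaling rule: shift the PMD-level tunable down to this order. */
static unsigned int scaled_max_none(unsigned int order, unsigned int max_ptes_none)
{
	assert(order >= MIN_MTHP_ORDER && order <= PMD_ORDER);
	return max_ptes_none >> (PMD_ORDER - order);
}

/* Mark the first 'populated' PTEs of a toy PMD range as present. */
static void fill_range(bool *pte, unsigned int populated)
{
	unsigned int i;

	for (i = 0; i < PMD_NR; i++)
		pte[i] = i < populated;
}

/*
 * Try to collapse the first (1 << order) PTEs. On success every PTE in the
 * block becomes populated, which is what feeds the next, larger attempt.
 */
static bool try_collapse(bool *pte, unsigned int order, unsigned int max_ptes_none)
{
	unsigned int nr = 1u << order, populated = 0, i;

	for (i = 0; i < nr; i++)
		populated += pte[i];
	if (populated == 0 ||
	    nr - populated > scaled_max_none(order, max_ptes_none))
		return false;
	for (i = 0; i < nr; i++)
		pte[i] = true;
	return true;
}

/* Largest order reached when collapsing bottom-up; 0 if nothing collapsed. */
static unsigned int bottom_up_order(unsigned int populated, unsigned int max_ptes_none)
{
	bool pte[PMD_NR];
	unsigned int order, reached = 0;

	fill_range(pte, populated);
	for (order = MIN_MTHP_ORDER; order <= PMD_ORDER; order++)
		if (try_collapse(pte, order, max_ptes_none))
			reached = order;
	return reached;
}

/* Would a PMD collapse succeed with no prior mTHP collapse? */
static bool direct_pmd_collapse(unsigned int populated, unsigned int max_ptes_none)
{
	bool pte[PMD_NR];

	fill_range(pte, populated);
	return try_collapse(pte, PMD_ORDER, max_ptes_none);
}
```

In this model, with 2 populated PTEs and max_ptes_none == 256, a direct PMD
collapse is refused but the bottom-up walk reaches PMD order, which is creep.
With 255 nothing collapses at any order, and with 511 the direct PMD collapse
succeeds anyway, so there is nothing to creep towards.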

>
> >
> > > max_ptes_none == 0 -> collapse mTHP only if all non-none/zero
> > >
> > > And for the intermediate values
> > >
> > > (1) pr_warn() when mTHPs are enabled, stating that mTHP collapse is not
> > > supported yet with other values
> >
> > It feels a bit much to issue a kernel warning every time somebody twiddles that
> > value, and it's kind of against user expectation a bit.
>
> pr_warn_once() is what I meant.

Right, but even then it feels a bit extreme; warnings are pretty serious
things. Then again, there's precedent for this, and it may be the least worst
solution.

I just picture a cloud provider turning this on with mTHP, then having their
monitoring team raise some urgent communication about warnings in dmesg :)

>
> >
> > But maybe it's the least worst way of communicating things. It's still
> > absolutely gross.
> >
> > > (2) treat it like max_ptes_none == 0 or (maybe better?) just disable mTHP
> > > collapse
> >
> > Yeah disabling mTHP collapse for these values seems sane, but it also seems that
> > we should be capping for this to work correctly no?
>
> I didn't get the interaction with capping, can you elaborate?

I think that's addressed in the discussion above, once we clarify the creep
thing then the rest should fall out.
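
For reference, the simplified proposal as discussed in this thread could be
sketched roughly as below. This is userspace C with hypothetical names, not
code from the series. It assumes 512 PTEs per PMD, takes option (2) as
"disable mTHP collapse", and stands in for the pr_warn_once() with a comment.
A return of -1 is an invented convention meaning "do not attempt mTHP collapse
at this order".

```c
#include <assert.h>

#define HPAGE_PMD_ORDER 9                       /* assumption: 4 KiB pages */
#define HPAGE_PMD_NR    (1u << HPAGE_PMD_ORDER) /* 512 PTEs per PMD */

/*
 * Hypothetical helper: how many none/zero PTEs may a collapse at 'order'
 * tolerate, given the max_ptes_none tunable? Returns -1 when mTHP collapse
 * should be skipped entirely.
 */
static int sketch_max_ptes_none(unsigned int order, unsigned int max_ptes_none)
{
	assert(order <= HPAGE_PMD_ORDER);

	/* PMD-sized collapse keeps honouring the tunable unchanged. */
	if (order == HPAGE_PMD_ORDER)
		return (int)max_ptes_none;

	/* 511: collapse mTHP always, so all but one PTE may be none/zero. */
	if (max_ptes_none == HPAGE_PMD_NR - 1)
		return (int)((1u << order) - 1);

	/* 0: collapse mTHP only if every PTE is non-none/zero. */
	if (max_ptes_none == 0)
		return 0;

	/*
	 * Intermediate values: the kernel side would pr_warn_once() here;
	 * this sketch just reports that mTHP collapse is disabled.
	 */
	return -1;
}
```

With these assumptions, sketch_max_ptes_none(4, 511) gives 15, an order-4
request with max_ptes_none == 255 gives -1, and a PMD-order request with 255
still gives 255.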

>
> >
> > Also I think all this probably violates requirements of users who want to have
> > different behaviour for mTHP and PMD THP.
> >
> > The default is 511 so we're in creep territory even with the damn default :)
>
> I don't think so, but maybe I am wrong.

Discussed above.

>
>
> --
> Cheers
>
> David / dhildenb
>

Thanks, Lorenzo
