lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a6c3a877-443a-45d5-ade4-bf0af0fb7de9@lucifer.local>
Date: Fri, 1 Aug 2025 11:32:27 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: David Hildenbrand <david@...hat.com>
Cc: Usama Arif <usamaarif642@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org, corbet@....net, rppt@...nel.org,
        surenb@...gle.com, mhocko@...e.com, hannes@...xchg.org,
        baohua@...nel.org, shakeel.butt@...ux.dev, riel@...riel.com,
        ziy@...dia.com, laoar.shao@...il.com, dev.jain@....com,
        baolin.wang@...ux.alibaba.com, npache@...hat.com,
        Liam.Howlett@...cle.com, ryan.roberts@....com, vbabka@...e.cz,
        jannh@...gle.com, Arnd Bergmann <arnd@...db.de>, sj@...nel.org,
        linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
        kernel-team@...a.com
Subject: Re: [PATCH v2 3/5] mm/huge_memory: treat MADV_COLLAPSE as an advise
 with PR_THP_DISABLE_EXCEPT_ADVISED

On Thu, Jul 31, 2025 at 04:54:38PM +0200, David Hildenbrand wrote:
> On 31.07.25 16:38, Lorenzo Stoakes wrote:
> > Nits on subject:
> >
> > - It's >75 chars
>
> No big deal. If we cna come up with something shorter, good.
>
> > - advise is the verb, advice is the noun.
>
> Yeah.
>
> >
> > On Thu, Jul 31, 2025 at 01:27:20PM +0100, Usama Arif wrote:
> > > From: David Hildenbrand <david@...hat.com>
> > >
> > > Let's allow for making MADV_COLLAPSE succeed on areas that neither have
> > > VM_HUGEPAGE nor VM_NOHUGEPAGE when we have THP disabled
> > > unless explicitly advised (PR_THP_DISABLE_EXCEPT_ADVISED).
> >
> > Hmm, I'm not sure about this.
> >
> > So far this prctl() has been the only way to override MADV_COLLAPSE
> > behaviour, but now we're allowing for this one case to not.
>
> This is not an override really. prctl() disallowed MADV_COLLAPSE, but in the
> new mode we don't want that anymore.

Yeah see below, I sort of convinced myself this is OK.

>
> > > I suppose the precedent is that MADV_COLLAPSE overrides 'madvise' sysfs
> > behaviour.
> > > I suppose what saves us here is 'advised' can be read to mean either
> > MADV_HUGEPAGE or MADV_COLLAPSE.
> > > And yes, MADV_COLLAPSE is clearly the user requesting this behaviour.
>
> Exactly.

And really I guess it's logical right? If you MADV_COLLAPSE you are saying
'I want THPs', if you MADV_HUGEPAGE you are saying 'I want THPs', the
difference being on timing.

>
> >
> > I think the vagueness here is one that already existed, because one could
> > perfectly one have expected MADV_COLLAPSE to obey sysfs and require
> > MADV_HUGEPAGE to have been applied, but of course this is not the case.
>
> Yes.
>
> >
> > OK so fine.
> >
> > BUT.
> >
> > I think the MADV_COLLAPSE man page will need to be updated to mention this.
> >
>
> Yes.
>
> > And I REALLY think we should update the THP doc too to mention all these
> > prctl() modes.
> >
> > I'm not sure we cover that right now _at all_ and obviously we should
> > describe the new flags.
> >
> > Usama - can you add a patch to this series to do that?
>
> Good point, let's document the interaction with prctl().

Thanks, I think we haven't even spoken that much about MADV_COLLAPSE in the
docs, somebody had a patch but then it fizzled out.

But that's a separate task, will add to my TODOs in case nobody else picks
up.

>
> >
> > >
> > > MADV_COLLAPSE is a clear advise that we want to collapse.
> >
> > advise -> advice.
> >
> > >
> > > Note that we still respect the VM_NOHUGEPAGE flag, just like
> > > MADV_COLLAPSE always does. So consequently, MADV_COLLAPSE is now only
> > > refused on VM_NOHUGEPAGE with PR_THP_DISABLE_EXCEPT_ADVISED.
> >
> > You also need to mention the shmem change you've made I think.
>
> Yes.
>
> > >>
> > > Co-developed-by: Usama Arif <usamaarif642@...il.com>
> > > Signed-off-by: Usama Arif <usamaarif642@...il.com>
> > > Signed-off-by: David Hildenbrand <david@...hat.com>
> > > ---
> > >   include/linux/huge_mm.h    | 8 +++++++-
> > >   include/uapi/linux/prctl.h | 2 +-
> > >   mm/huge_memory.c           | 5 +++--
> > >   mm/memory.c                | 6 ++++--
> > >   mm/shmem.c                 | 2 +-
> > >   5 files changed, 16 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> > > index b0ff54eee81c..aeaf93f8ac2e 100644
> > > --- a/include/linux/huge_mm.h
> > > +++ b/include/linux/huge_mm.h
> > > @@ -329,7 +329,7 @@ struct thpsize {
> > >    * through madvise or prctl.
> > >    */
> > >   static inline bool vma_thp_disabled(struct vm_area_struct *vma,
> > > -		vm_flags_t vm_flags)
> > > +		vm_flags_t vm_flags, bool forced_collapse)
> > >   {
> > >   	/* Are THPs disabled for this VMA? */
> > >   	if (vm_flags & VM_NOHUGEPAGE)
> > > @@ -343,6 +343,12 @@ static inline bool vma_thp_disabled(struct vm_area_struct *vma,
> > >   	 */
> > >   	if (vm_flags & VM_HUGEPAGE)
> > >   		return false;
> > > +	/*
> > > +	 * Forcing a collapse (e.g., madv_collapse), is a clear advise to
> >
> > advise -> advice.
> >
> > > +	 * use THPs.
> > > +	 */
> > > +	if (forced_collapse)
> > > +		return false;
> > >   	return test_bit(MMF_DISABLE_THP_EXCEPT_ADVISED, &vma->vm_mm->flags);
> > >   }
> > >
> > > diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> > > index 9c1d6e49b8a9..ee4165738779 100644
> > > --- a/include/uapi/linux/prctl.h
> > > +++ b/include/uapi/linux/prctl.h
> > > @@ -185,7 +185,7 @@ struct prctl_mm_map {
> > >   #define PR_SET_THP_DISABLE	41
> > >   /*
> > >    * Don't disable THPs when explicitly advised (e.g., MADV_HUGEPAGE /
> > > - * VM_HUGEPAGE).
> > > + * VM_HUGEPAGE / MADV_COLLAPSE).
> >
> > This is confusing you're mixing VMA flags with MADV ones... maybe just
> > stick to madvise ones, or add extra context around VM_HUGEPAGE bit?
>
> I don't see anything confusing here, really.
>
> But if it helps you, we can do
> 	(e.g., MADV_HUGEPAGE / VM_HUGEPAGE, MADV_COLLAPSE).
>
> (reason VM_HUGEPAGE is spelled out is that there might be code where we set
> VM_HUGEPAGE implicitly in the kernel)

Yeah to be clear this is a pretty minor point, and I see why we'd want to
mention VM_HUGEPAGE explicitly, as it is in fact this that overrides THP
being disabled for the process.

>
> >
> > Would need to be fixed up in a prior commit obviously.
> >
> > >    */
> > >   # define PR_THP_DISABLE_EXCEPT_ADVISED	(1 << 1)
> > >   #define PR_GET_THP_DISABLE	42
> > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > > index 85252b468f80..ef5ccb0ec5d5 100644
> > > --- a/mm/huge_memory.c
> > > +++ b/mm/huge_memory.c
> > > @@ -104,7 +104,8 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
> > >   {
> > >   	const bool smaps = type == TVA_SMAPS;
> > >   	const bool in_pf = type == TVA_PAGEFAULT;
> > > -	const bool enforce_sysfs = type != TVA_FORCED_COLLAPSE;
> > > +	const bool forced_collapse = type == TVA_FORCED_COLLAPSE;
> > > +	const bool enforce_sysfs = !forced_collapse;
> >
> > Can we just get rid of this enforce_sysfs altogether in patch 2/5 and use
> > forced_collapse?
>
> Let's do that as a separate cleanup on top. I want least churn in that
> patch.
>
> (had the same idea while writing that patch, but I have other things to
> focus on than cleaning up all this mess)

Ack, I'm fine with that.

>
> --
> Cheers,
>
> David / dhildenb
>

Cheers, Lorenzo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ