linux-kernel - Re: [PATCH HOTFIX 6.17] mm/mremap: avoid expensive folio lookup on mremap folio pte batch


Message-ID: <0babf2ec-040f-4f7b-aa76-f59e80274333@lucifer.local>
Date: Mon, 11 Aug 2025 16:08:23 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Barry Song <21cnbao@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
        Pedro Falcato <pfalcato@...e.de>, Dev Jain <dev.jain@....com>,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        David Hildenbrand <david@...hat.com>
Subject: Re: [PATCH HOTFIX 6.17] mm/mremap: avoid expensive folio lookup on
 mremap folio pte batch

On Mon, Aug 11, 2025 at 02:52:51PM +0800, Barry Song wrote:
> On Mon, Aug 11, 2025 at 12:57 PM Lorenzo Stoakes
> <lorenzo.stoakes@...cle.com> wrote:
> >
> > On Mon, Aug 11, 2025 at 10:40:50AM +0800, Barry Song wrote:
> > > On Fri, Aug 8, 2025 at 2:59 AM Lorenzo Stoakes
> > > > The expectation by those discussing this from the start was that
> > > > vm_normal_folio() (invoked by mremap_folio_pte_batch()) would likely be the
> > > > culprit due to having to retrieve memory from the vmemmap (which mremap()
> > > > page table moves does not otherwise do, meaning this is inevitably cold
> > > > memory).
> > >
> > > If vm_normal_folio() is so expensive, does that mean it negates the
> > > benefits that commit f822a9a81a31 (“mm: optimize mremap() by PTE
> > > batching”) was originally intended to achieve through PTE batching?
> >
> > Not for arm64 apparently. And the hint check introduced here should avoid
> > regressions even there when small folios are in place.
>
> I still don’t understand why this is fine on arm64. We do have faster
> folio_pte_batch(), get_and_clear_ptes(), and set_ptes() with contpte, but
> are those benefits really enough to outweigh the disadvantage of
> vm_normal_folio(), given those PTEs are likely in the same cacheline?

Well, in operations that already need a folio it's not really an extra cost.

For mremap(), where we don't otherwise need the folio, note that since we're now
gating on the hint we'd have to have cont PTE entries to do the lookup at all.
That means we're only looking up the folio every 2, 3 or 4 PTE entries, not for
each and every one.

So this is a significant reduction in time taken in theory.
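
To make the gating argument concrete, here is a rough user-space model of it.
This is only a sketch and not the kernel code: struct model_pte,
model_folio_pte_batch(), model_mremap_batch() and count_lookups() are
hypothetical names, the batch_hint field stands in for what the batch hint
would report, and the lookup counter stands in for the cold vmemmap access
done by vm_normal_folio().

```c
#include <assert.h>

/* Hypothetical model of a PTE: only the batch hint matters here. */
struct model_pte {
    int batch_hint; /* entries batchable from this one; must be >= 1 */
};

/* Counts simulated folio lookups (the expensive, cold-memory step). */
static int folio_lookups;

/* Stand-in for "look up the folio, then batch": always pays for a lookup. */
static int model_folio_pte_batch(const struct model_pte *pte, int max_nr)
{
    folio_lookups++;
    return pte->batch_hint < max_nr ? pte->batch_hint : max_nr;
}

/* Returns how many entries to move in one go. */
static int model_mremap_batch(const struct model_pte *pte, int max_nr,
                              int gate_on_hint)
{
    assert(max_nr >= 1);

    if (max_nr == 1)
        return 1;

    /* The gate: a hint of 1 means batching cannot help, so skip the lookup. */
    if (gate_on_hint && pte->batch_hint == 1)
        return 1;

    return model_folio_pte_batch(pte, max_nr);
}

/* Walks nr entries and reports how many folio lookups were needed. */
static int count_lookups(const struct model_pte *ptes, int nr,
                         int gate_on_hint)
{
    folio_lookups = 0;
    for (int i = 0; i < nr; )
        i += model_mremap_batch(&ptes[i], nr - i, gate_on_hint);
    return folio_lookups;
}
```

In this model, a run of entries whose hint is 1 (small folios) costs no
lookups at all once the gate is on, while a run of batchable entries costs
one lookup per batch rather than one per entry.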

In practice - well I'll let Dev handle that :)

>
> Unless the previous contpte_try_unfold() was very costly and removing it yielded
> a significant improvement, it’s difficult to see how the benefits would outweigh
> the drawbacks of vm_normal_folio(). Does this imply that there was already a
> regression in mremap() caused by contpte_try_unfold() before?
> And that Dev’s patch is essentially a fix for this regression on arm64?

Yeah maybe, and that'd be interesting - Dev/Ryan?

>
> Sorry, maybe I’m talking too much, but I’m curious about the whole story:-)

No, please always query things, it's important stuff!

Cheers, Lorenzo