Message-ID: <64c2ec85-87ed-4793-b0e9-a68eac1a8020@lucifer.local>
Date: Tue, 1 Jul 2025 09:51:26 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Ryan Roberts <ryan.roberts@....com>
Cc: Dev Jain <dev.jain@....com>, akpm@...ux-foundation.org, david@...hat.com,
willy@...radead.org, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
catalin.marinas@....com, will@...nel.org, Liam.Howlett@...cle.com,
vbabka@...e.cz, jannh@...gle.com, anshuman.khandual@....com,
peterx@...hat.com, joey.gouly@....com, ioworker0@...il.com,
baohua@...nel.org, kevin.brodsky@....com, quic_zhenhuah@...cinc.com,
christophe.leroy@...roup.eu, yangyicong@...ilicon.com,
linux-arm-kernel@...ts.infradead.org, hughd@...gle.com,
yang@...amperecomputing.com, ziy@...dia.com
Subject: Re: [PATCH v4 3/4] mm: Optimize mprotect() by PTE-batching
On Tue, Jul 01, 2025 at 09:30:51AM +0100, Ryan Roberts wrote:
> >> In an ideal world we would flatten and just have mprotect_folio_pte_batch()
> >> return the batch size considering all the relevant PTE bits AND the
> >> AnonExclusive bit on the pages. IIRC one of Dev's earlier versions modified the
> >> core folio_pte_batch() function to also look at the AnonExclusive bit, but I
> >> really disliked changing that core function (I think others did too?).
> >
> > Yeah let's not change the core function.
> >
> > My suggestion is to have mprotect_folio_pte_batch() do this.
> >
> >>
> >> So barring that approach, we are really only left with the batch and sub-batch
> >> approach - although, yes, it could be abstracted more. We could maintain a
> >> context struct that persists across all calls to mprotect_folio_pte_batch() and
> >> it can use that to keep its state to remember if we are in the middle of a
> >> sub-batch and decide either to call folio_pte_batch() to get a new batch, or
> >> call anon_exclusive_batch() to get the next sub-batch within the current batch.
> >> But that started to feel overly abstracted to me.
> >
> > Having this nested batch/sub-batch loop really feels worse. You just get lost in
> > the complexity here very easily.
> >
> > But I'm also not sure we need to maintain _that_ much state?
> >
> > We're already looping over all of the PTEs here, so abstracting _the entire
> > loop_ and all the sub-batch stuff into another function - sensibly
> > mprotect_folio_pte_batch() I think - so it handles this for you makes a ton
> > of sense.
>
> So effectively turn mprotect_folio_pte_batch() into an iterator; have a struct
> and a function to init the struct for the number of ptes we want to iterate
> over, then a per iteration function that progressively returns batches?
Is that really necessary though?
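Just so we're talking about the same shape - as a standalone userspace toy
model (the struct/function names here are invented for illustration, not
proposed API, and a bool array stands in for the per-page AnonExclusive bit),
the iterator you describe would be roughly:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model of the init-then-iterate shape: a context struct holds the
 * position, and each call hands back the next run of PTEs whose
 * AnonExclusive state matches.  Names are made up for illustration.
 */
struct pte_batch_iter {
	const bool *anon_excl;	/* stand-in for per-page AnonExclusive */
	size_t nr_ptes;		/* total PTEs to iterate over */
	size_t pos;		/* next PTE to hand out */
};

static void batch_iter_init(struct pte_batch_iter *it,
			    const bool *anon_excl, size_t nr_ptes)
{
	it->anon_excl = anon_excl;
	it->nr_ptes = nr_ptes;
	it->pos = 0;
}

/*
 * Return the length of the next batch (a maximal run of PTEs with the
 * same AnonExclusive state), or 0 once the range is exhausted.
 */
static size_t batch_iter_next(struct pte_batch_iter *it)
{
	size_t start = it->pos, end = start;

	if (start >= it->nr_ptes)
		return 0;
	while (end < it->nr_ptes &&
	       it->anon_excl[end] == it->anon_excl[start])
		end++;
	it->pos = end;
	return end - start;
}
```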
Idea is that mprotect_folio_pte_batch() returns the nr_ptes _taking into account
the PAE stuff_.
Would this break anything?
We might need to pass a flag to say 'don't account for this' for prot numa case.
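Again as a standalone toy model (invented names, userspace C, a bool array
standing in for AnonExclusive; in the real code folio_pte_batch() would bound
the batch by the PTE bits, which max_nr stands in for here), the flat shape
I have in mind is something like:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag for the prot_numa case: skip the AnonExclusive check. */
#define TOY_IGNORE_ANON_EXCL	(1u << 0)

/*
 * Toy model of a single flat call: return the batch length, folding the
 * AnonExclusive sub-batching in unless the caller opts out via flags.
 */
static size_t toy_mprotect_pte_batch(const bool *anon_excl, size_t start,
				     size_t max_nr, unsigned int flags)
{
	size_t nr = 1;

	while (nr < max_nr) {
		if (!(flags & TOY_IGNORE_ANON_EXCL) &&
		    anon_excl[start + nr] != anon_excl[start])
			break;
		nr++;
	}
	return nr;
}
```

The caller then just loops, advancing by whatever nr comes back, with no
nested sub-batch loop and no persistent iterator state.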
>
> Then we just have a simple loop here that gets the next batch and processes it?