Message-ID: <d968a24c-7e2c-40ca-a290-00b68222ac1c@arm.com>
Date: Sat, 28 Jun 2025 18:09:46 +0530
From: Dev Jain <dev.jain@....com>
To: akpm@...ux-foundation.org
Cc: ryan.roberts@....com, david@...hat.com, willy@...radead.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org, catalin.marinas@....com,
will@...nel.org, Liam.Howlett@...cle.com, lorenzo.stoakes@...cle.com,
vbabka@...e.cz, jannh@...gle.com, anshuman.khandual@....com,
peterx@...hat.com, joey.gouly@....com, ioworker0@...il.com,
baohua@...nel.org, kevin.brodsky@....com, quic_zhenhuah@...cinc.com,
christophe.leroy@...roup.eu, yangyicong@...ilicon.com,
linux-arm-kernel@...ts.infradead.org, hughd@...gle.com,
yang@...amperecomputing.com, ziy@...dia.com
Subject: Re: [PATCH v4 3/4] mm: Optimize mprotect() by PTE-batching
On 28/06/25 5:04 pm, Dev Jain wrote:
> Use folio_pte_batch to batch process a large folio. Reuse the folio from
> prot_numa case if possible.
>
> For all cases other than the PageAnonExclusive case, if the case holds true
> for one pte in the batch, it holds true for the other ptes in the batch
> too; for pte_needs_soft_dirty_wp(), this is because we do not pass
> FPB_IGNORE_SOFT_DIRTY. modify_prot_start_ptes() collects the dirty
> and access bits across the batch, so we are also batching across
> pte_dirty(): this is correct since the dirty bit on the PTE really is
> just an indication that the folio got written to, so even if a given PTE
> is not actually dirty (but one of the PTEs in the batch is), the wp-fault
> optimization can be made.
>
> The crux now is how to batch around the PageAnonExclusive case; we must
> check the corresponding condition for every single page. Therefore, from
> the large folio batch, we process sub batches of ptes mapping pages with
> the same PageAnonExclusive condition, and process that sub batch, then
> determine and process the next sub batch, and so on. Note that this does
> not cause any extra overhead: if the size of the folio batch is 512, the
> sub batch processing takes 512 iterations in total, which is the same as
> what we would have done before.
>
> Signed-off-by: Dev Jain <dev.jain@....com>
> ---
>
Forgot to add:
Co-developed-by: Ryan Roberts <ryan.roberts@....com>
Signed-off-by: Ryan Roberts <ryan.roberts@....com>
as this patch is almost identical to the diff Ryan had suggested.
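
For readers following the sub batch argument in the quoted commit message, here is a rough userspace sketch of the idea. It is not the kernel code from the patch: the bool array is a hypothetical stand-in for the per-page PageAnonExclusive check, and sub_batch_len() and process_batch() are names made up for illustration. It only shows that splitting a batch into runs with the same condition still consumes each page exactly once.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Illustration only. flags[i] is a hypothetical stand-in for the
 * per-page PageAnonExclusive condition of page i in the folio batch.
 * Returns the length of the run starting at 'start' in which every
 * page has the same condition as flags[start].
 */
static size_t sub_batch_len(const bool *flags, size_t start, size_t nr)
{
	size_t i = start + 1;

	while (i < nr && flags[i] == flags[start])
		i++;
	return i - start;
}

/*
 * Walk a batch of 'nr' pages as consecutive sub batches. Returns the
 * number of sub batches and stores the number of pages consumed in
 * *pages_done. Real code would apply the protection change to each
 * sub batch where the comment below sits.
 */
static size_t process_batch(const bool *flags, size_t nr, size_t *pages_done)
{
	size_t start = 0, sub_batches = 0;

	*pages_done = 0;
	while (start < nr) {
		size_t len = sub_batch_len(flags, start, nr);

		/* one operation would cover 'len' ptes here */
		*pages_done += len;
		start += len;
		sub_batches++;
	}
	return sub_batches;
}
```

Since every sub batch starts where the previous one ended, the pages consumed always add up to nr, whatever the pattern of flags is.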