[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b5b5d823-068f-45d0-90ef-4fa804a84bdd@arm.com>
Date: Wed, 2 Jul 2025 20:31:33 +0530
From: Dev Jain <dev.jain@....com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: akpm@...ux-foundation.org, ryan.roberts@....com, david@...hat.com,
willy@...radead.org, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
catalin.marinas@....com, will@...nel.org, Liam.Howlett@...cle.com,
vbabka@...e.cz, jannh@...gle.com, anshuman.khandual@....com,
peterx@...hat.com, joey.gouly@....com, ioworker0@...il.com,
baohua@...nel.org, kevin.brodsky@....com, quic_zhenhuah@...cinc.com,
christophe.leroy@...roup.eu, yangyicong@...ilicon.com,
linux-arm-kernel@...ts.infradead.org, hughd@...gle.com,
yang@...amperecomputing.com, ziy@...dia.com
Subject: Re: [PATCH v4 1/4] mm: Optimize mprotect() for MM_CP_PROT_NUMA by
batch-skipping PTEs
On 02/07/25 3:07 pm, Lorenzo Stoakes wrote:
> On Sat, Jun 28, 2025 at 05:04:32PM +0530, Dev Jain wrote:
>> In case of prot_numa, there are various cases in which we can skip to the
>> next iteration. Since the skip condition is based on the folio and not
>> the PTEs, we can skip a PTE batch. Additionally refactor all of this
>> into a new function to clean up the existing code.
>>
>> Signed-off-by: Dev Jain <dev.jain@....com>
>> ---
>> mm/mprotect.c | 134 ++++++++++++++++++++++++++++++++------------------
>> 1 file changed, 87 insertions(+), 47 deletions(-)
>>
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index 88709c01177b..af10a7fbe6b8 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -83,6 +83,83 @@ bool can_change_pte_writable(struct vm_area_struct *vma, unsigned long addr,
>> return pte_dirty(pte);
>> }
>>
>> +static int mprotect_folio_pte_batch(struct folio *folio, unsigned long addr,
>> + pte_t *ptep, pte_t pte, int max_nr_ptes)
>> +{
>> + const fpb_t flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
>> +
>> + if (!folio || !folio_test_large(folio) || (max_nr_ptes == 1))
>> + return 1;
>> +
>> + return folio_pte_batch(folio, addr, ptep, pte, max_nr_ptes, flags,
>> + NULL, NULL, NULL);
>> +}
>> +
>> +static int prot_numa_skip_ptes(struct folio **foliop, struct vm_area_struct *vma,
>> + unsigned long addr, pte_t oldpte, pte_t *pte, int target_node,
>> + int max_nr_ptes)
>> +{
> While it's nice to separate this out, it's not so nice to pass folio as a
> pointer to a pointer like this and maybe or maybe not set it.
>
> Just get the folio before you call this... you'll need it either way.
I did that on David's suggestion:
https://lore.kernel.org/all/8c389ee5-f7a4-44f6-a0d6-cc01c3da4d91@redhat.com/
We were trying to reuse the folio if available from prot_numa_skip_ptes,
to avoid using vm_normal_folio() again. Not sure if avoiding vm_normal_folio
is worth the complexity.
>
> I'll wait until you separate it all out before reviewing next revision as a bit
> tricky as-is.
>
> Thanks!
Powered by blists - more mailing lists