Message-ID: <0d52d680-f3d3-454f-8c12-602f650469ab@arm.com>
Date: Wed, 6 Aug 2025 15:07:49 +0530
From: Dev Jain <dev.jain@....com>
To: David Hildenbrand <david@...hat.com>,
 Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: akpm@...ux-foundation.org, ryan.roberts@....com, willy@...radead.org,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org, catalin.marinas@....com,
 will@...nel.org, Liam.Howlett@...cle.com, vbabka@...e.cz, jannh@...gle.com,
 anshuman.khandual@....com, peterx@...hat.com, joey.gouly@....com,
 ioworker0@...il.com, baohua@...nel.org, kevin.brodsky@....com,
 quic_zhenhuah@...cinc.com, christophe.leroy@...roup.eu,
 yangyicong@...ilicon.com, linux-arm-kernel@...ts.infradead.org,
 hughd@...gle.com, yang@...amperecomputing.com, ziy@...dia.com
Subject: Re: [PATCH v5 6/7] mm: Optimize mprotect() by PTE batching


On 06/08/25 2:51 pm, David Hildenbrand wrote:
> On 06.08.25 11:12, Lorenzo Stoakes wrote:
>> On Wed, Aug 06, 2025 at 10:08:33AM +0200, David Hildenbrand wrote:
>>> On 18.07.25 11:02, Dev Jain wrote:
>>>> Signed-off-by: Dev Jain <dev.jain@....com>
>>>
>>>
>>> I wanted to review this, but looks like it's already upstream and I suspect
>>> it's buggy (see the upstream report I cc'ed you on)
>>>
>>> [...]
>>>
>>>> +
>>>> +/*
>>>> + * This function is a result of trying our very best to retain the
>>>> + * "avoid the write-fault handler" optimization. In can_change_pte_writable(),
>>>> + * if the vma is a private vma, and we cannot determine whether to change
>>>> + * the pte to writable just from the vma and the pte, we then need to look
>>>> + * at the actual page pointed to by the pte. Unfortunately, if we have a
>>>> + * batch of ptes pointing to consecutive pages of the same anon large folio,
>>>> + * the anon-exclusivity (or the negation) of the first page does not guarantee
>>>> + * the anon-exclusivity (or the negation) of the other pages corresponding to
>>>> + * the pte batch; hence in this case it is incorrect to decide to change or
>>>> + * not change the ptes to writable just by using information from the first
>>>> + * pte of the batch. Therefore, we must individually check all pages and
>>>> + * retrieve sub-batches.
>>>> + */
>>>> +static void commit_anon_folio_batch(struct vm_area_struct *vma,
>>>> +        struct folio *folio, unsigned long addr, pte_t *ptep,
>>>> +        pte_t oldpte, pte_t ptent, int nr_ptes, struct mmu_gather *tlb)
>>>> +{
>>>> +    struct page *first_page = folio_page(folio, 0);
>>>
>>> Who says that we have the first page of the folio mapped into the first PTE
>>> of the batch?
>>
>> Yikes, missed this sorry. Got too tied up in algorithm here.
>>
>> You mean in _this_ PTE of the batch right? As we're invoking these on each part
>> of the PTE table.
>>
>> I mean I guess we can simply do:
>>
>>     struct page *first_page = pte_page(ptent);
>>
>> Right?
>
> Yes, but we should forward the result from vm_normal_page(), which does
> exactly that for you, and increment the page accordingly as required,
> just like with the pte we are processing.

Makes sense, so I guess I will have to change the signature of prot_numa_skip()
to pass a double ptr to a page instead of folio and derive the folio in the
caller, and pass down both the folio and the page to
set_write_prot_commit_flush_ptes.
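
To make the problem concrete, here is a minimal userspace sketch, not kernel
code. It models a large folio as an array of pages with a per-page
anon-exclusive flag, and splits a PTE batch into sub-batches of equal
exclusivity. All toy_* names are hypothetical and exist only for this
illustration; the real patch works on struct page, struct folio and pte_t. The
point it tries to show is that the walk has to start from the page mapped by
the first PTE of the batch, which need not be page 0 of the folio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the flag we care about. */
struct toy_page {
	bool anon_exclusive;	/* models the per-page anon-exclusive state */
};

/*
 * Length of the run starting at page[0] in which every page has the same
 * anon-exclusive state as page[0], capped at max_len.
 */
static size_t toy_sub_batch_len(const struct toy_page *page, size_t max_len)
{
	bool expected;
	size_t len = 1;

	assert(max_len > 0);
	expected = page[0].anon_exclusive;

	while (len < max_len && page[len].anon_exclusive == expected)
		len++;
	return len;
}

/*
 * Walk a batch of nr_ptes entries whose first entry maps 'page'. Returns how
 * many entries would be made writable (the anon-exclusive ones) and stores
 * the number of sub-batches processed in *nr_sub_batches.
 */
static size_t toy_commit_batch(const struct toy_page *page, size_t nr_ptes,
			       size_t *nr_sub_batches)
{
	size_t writable = 0;

	*nr_sub_batches = 0;
	while (nr_ptes > 0) {
		size_t len = toy_sub_batch_len(page, nr_ptes);

		if (page->anon_exclusive)
			writable += len;

		/* Advance the page together with the entries, as with the pte. */
		page += len;
		nr_ptes -= len;
		(*nr_sub_batches)++;
	}
	return writable;
}
```

With an 8-page folio whose exclusivity flags are 1,1,1,0,0,0,1,0 and a batch of
4 entries that starts at page index 3, passing &pages[3] marks one entry
writable, while passing &pages[0] (the folio_page(folio, 0) mistake) would
mark three.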


>
> ...
>
>>>
>>>> +            else
>>>> +                prot_commit_flush_ptes(vma, addr, pte, oldpte, ptent,
>>>> +                    nr_ptes, /* idx = */ 0, /* set_write = */ false, tlb);
>>>
>>> Semi-broken indentation.
>>
>> Because of else then 2 lines after?
>
> prot_commit_flush_ptes(vma, addr, pte, oldpte, ptent,
>                nr_ptes, /* idx = */ 0, /* set_write = */ false, tlb);
>
> Is what I would have expected.
>
>
> I think a smart man once said, that if you need more than one line per statement in
> an if/else clause, a set of {} can aid readability. But I don't particularly care :)
>
