Message-ID: <645178fd-df4e-42fe-b55e-97d9506499be@arm.com>
Date: Fri, 7 Nov 2025 15:22:54 +0000
From: Ryan Roberts <ryan.roberts@....com>
To: "David Hildenbrand (Red Hat)" <davidhildenbrandkernel@...il.com>,
 Kevin Brodsky <kevin.brodsky@....com>, linux-mm@...ck.org
Cc: linux-kernel@...r.kernel.org, Alexander Gordeev <agordeev@...ux.ibm.com>,
 Andreas Larsson <andreas@...sler.com>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Boris Ostrovsky <boris.ostrovsky@...cle.com>, Borislav Petkov
 <bp@...en8.de>, Catalin Marinas <catalin.marinas@....com>,
 Christophe Leroy <christophe.leroy@...roup.eu>,
 Dave Hansen <dave.hansen@...ux.intel.com>,
 "David S. Miller" <davem@...emloft.net>,
 David Woodhouse <dwmw2@...radead.org>, "H. Peter Anvin" <hpa@...or.com>,
 Ingo Molnar <mingo@...hat.com>, Jann Horn <jannh@...gle.com>,
 Juergen Gross <jgross@...e.com>, "Liam R. Howlett"
 <Liam.Howlett@...cle.com>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
 Madhavan Srinivasan <maddy@...ux.ibm.com>,
 Michael Ellerman <mpe@...erman.id.au>, Michal Hocko <mhocko@...e.com>,
 Mike Rapoport <rppt@...nel.org>, Nicholas Piggin <npiggin@...il.com>,
 Peter Zijlstra <peterz@...radead.org>, Suren Baghdasaryan
 <surenb@...gle.com>, Thomas Gleixner <tglx@...utronix.de>,
 Vlastimil Babka <vbabka@...e.cz>, Will Deacon <will@...nel.org>,
 Yeoreum Yun <yeoreum.yun@....com>, linux-arm-kernel@...ts.infradead.org,
 linuxppc-dev@...ts.ozlabs.org, sparclinux@...r.kernel.org,
 xen-devel@...ts.xenproject.org, x86@...nel.org
Subject: Re: [PATCH v4 06/12] mm: introduce generic lazy_mmu helpers

On 07/11/2025 14:34, David Hildenbrand (Red Hat) wrote:
>>>   #ifndef pte_batch_hint
>>> diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
>>> index 5d2a876035d6..c49b029d3593 100644
>>> --- a/mm/kasan/shadow.c
>>> +++ b/mm/kasan/shadow.c
>>> @@ -305,7 +305,7 @@ static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
>>>       pte_t pte;
>>>       int index;
>>>   -    arch_leave_lazy_mmu_mode();
>>> +    lazy_mmu_mode_pause();
>>
>> I wonder if there really are use cases that *require* pause/resume? I think
>> these kasan cases could be correctly implemented using a new nest level instead?
>> Are there cases where the effects really need to be immediate or do the effects
>> just need to be visible when you get to where the resume is?
>>
>> If the latter, that could just be turned into a nested disable (e.g. a flush).
>> In this case, there is only 1 PTE write so no benefit, but I wonder if other
>> cases may have more PTE writes that could then still be batched. It would be
>> nice to simplify the API by removing pause/resume if we can?
> 
> It has clear semantics, clearer than some nest-disable IMHO.
> 
> Maybe you can elaborate how you would change ("simplify") the API in that
> regard? What would the API look like?

By "simplify", I just meant: can we remove lazy_mmu_mode_pause() and
lazy_mmu_mode_resume()?


We currently have:

apply_to_page_range
  lazy_mmu_mode_enable()
    kasan_populate_vmalloc_pte()
      lazy_mmu_mode_pause()
      <code>
      lazy_mmu_mode_resume()
  lazy_mmu_mode_disable()

where <code> sets PTEs. But if <code> doesn't need its effects to be visible
until lazy_mmu_mode_resume(), then you could replace the block with:

apply_to_page_range
  lazy_mmu_mode_enable()
    kasan_populate_vmalloc_pte()
      lazy_mmu_mode_enable()
      <code>
      lazy_mmu_mode_disable()
  lazy_mmu_mode_disable()
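
To make the comparison concrete, here is a toy user-space model of the two
call sequences. It is not kernel code and every name in it (model_enable(),
model_pause(), model_set() and so on) is hypothetical. It assumes that writes
are queued while lazy mode is active, that every disable (including a nested
one) flushes the queue, and that pause flushes and then makes writes immediate
until resume. The real helpers in the series may behave differently.

```c
#include <assert.h>
#include <stdbool.h>

#define NSLOTS 4

/* Toy model only: "table" stands in for the real page table. */
static unsigned long table[NSLOTS];
static unsigned long pending_val[NSLOTS];
static bool pending[NSLOTS];
static int depth;    /* nesting level of model_enable() calls */
static bool paused;  /* true between model_pause() and model_resume() */

/* Apply all queued writes to the table. */
static void model_flush(void)
{
	for (int i = 0; i < NSLOTS; i++) {
		if (pending[i]) {
			table[i] = pending_val[i];
			pending[i] = false;
		}
	}
}

static void model_enable(void) { depth++; }

/* Assumption: every disable flushes, even a nested one. */
static void model_disable(void)
{
	assert(depth > 0);
	model_flush();
	depth--;
}

/* Assumption: pause flushes, then writes go straight to the table. */
static void model_pause(void) { model_flush(); paused = true; }
static void model_resume(void) { paused = false; }

static void model_set(int i, unsigned long v)
{
	if (depth > 0 && !paused) {
		pending_val[i] = v;  /* queued until the next flush */
		pending[i] = true;
	} else {
		table[i] = v;        /* immediate */
	}
}

/* Reads see only what has reached the table. */
static unsigned long model_get(int i) { return table[i]; }

struct visibility {
	bool inside;  /* write visible before the inner section ends? */
	bool after;   /* write visible once the inner section has ended? */
};

/* enable / pause / <code> / resume / disable */
static struct visibility run_pause_variant(void)
{
	struct visibility v;

	model_enable();
	model_pause();
	model_set(0, 1);
	v.inside = (model_get(0) == 1);
	model_resume();
	v.after = (model_get(0) == 1);
	model_disable();
	return v;
}

/* enable / enable / <code> / disable / disable */
static struct visibility run_nested_variant(void)
{
	struct visibility v;

	model_enable();
	model_enable();
	model_set(1, 1);
	v.inside = (model_get(1) == 1);
	model_disable();
	v.after = (model_get(1) == 1);
	model_disable();
	return v;
}
```

Under these assumptions the two variants differ only inside the inner section:
the pause variant makes the write visible immediately, while the nested variant
makes it visible when the inner disable returns. Both agree at the point where
the resume would have been.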

However, looking at this more closely, I'm not really clear on why we need *any*
special attention to lazy mmu inside of kasan_populate_vmalloc_pte() and
kasan_depopulate_vmalloc_pte().

I *think* that the original concern was that we were doing ptep_get(ptep) inside
of a lazy_mmu block? So we need to flush so that the getter returns the most
recent value? But given we have never written to that particular ptep while in
the lazy mmu block, there is surely no hazard in the first place?

apply_to_existing_page_range() will only call kasan_depopulate_vmalloc_pte()
once per pte, right? So given we read the ptep before writing it, there should
be no hazard? If so, we can remove pause/resume.
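
The read-before-write argument can be sketched with another toy user-space
model. Again, this is not kernel code and all names (toy_set(), toy_get(),
toy_flush(), ...) are hypothetical. It assumes the worst case for the getter:
reads see only values that have already been flushed, never a queued write.

```c
#include <assert.h>
#include <stdbool.h>

#define NPTES 4

/* Toy model only: real_pte[] is what a getter would observe. */
static unsigned long real_pte[NPTES];
static unsigned long queued_pte[NPTES];
static bool queued[NPTES];

static void toy_reset(void)
{
	for (int i = 0; i < NPTES; i++) {
		real_pte[i] = 0;
		queued_pte[i] = 0;
		queued[i] = false;
	}
}

/* In lazy mode every write is queued rather than applied. */
static void toy_set(int i, unsigned long v)
{
	queued_pte[i] = v;
	queued[i] = true;
}

/* Assumption: the getter bypasses the queue entirely. */
static unsigned long toy_get(int i) { return real_pte[i]; }

static void toy_flush(void)
{
	for (int i = 0; i < NPTES; i++) {
		if (queued[i]) {
			real_pte[i] = queued_pte[i];
			queued[i] = false;
		}
	}
}

/*
 * Visit each slot exactly once, reading before writing, as a per-pte
 * callback would. Returns how many reads hit a slot that still had a
 * write pending, i.e. how many reads could have been stale.
 */
static int walk_once_read_then_write(void)
{
	int stale = 0;

	for (int i = 0; i < NPTES; i++) {
		if (queued[i])
			stale++;
		(void)toy_get(i);
		toy_set(i, (unsigned long)i + 1);
	}
	toy_flush();
	return stale;
}

/* The hazardous pattern: write a slot, then read the same slot. */
static int write_then_read_same_slot(void)
{
	int stale;

	toy_set(0, 42);
	stale = (toy_get(0) != 42);
	toy_flush();
	return stale;
}
```

In this model a stale read needs an earlier queued write to the same slot. A
walk that touches each slot once and reads before it writes never produces
one, which is the case being argued for the kasan callbacks.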

Thanks,
Ryan

