lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 12 Feb 2024 17:27:59 +0100
From: David Hildenbrand <david@...hat.com>
To: Ryan Roberts <ryan.roberts@....com>,
 Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
 Ard Biesheuvel <ardb@...nel.org>, Marc Zyngier <maz@...nel.org>,
 James Morse <james.morse@....com>, Andrey Ryabinin <ryabinin.a.a@...il.com>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Matthew Wilcox <willy@...radead.org>, Mark Rutland <mark.rutland@....com>,
 Kefeng Wang <wangkefeng.wang@...wei.com>, John Hubbard
 <jhubbard@...dia.com>, Zi Yan <ziy@...dia.com>,
 Barry Song <21cnbao@...il.com>, Alistair Popple <apopple@...dia.com>,
 Yang Shi <shy828301@...il.com>, Nicholas Piggin <npiggin@...il.com>,
 Christophe Leroy <christophe.leroy@...roup.eu>,
 "Aneesh Kumar K.V" <aneesh.kumar@...nel.org>,
 "Naveen N. Rao" <naveen.n.rao@...ux.ibm.com>,
 Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
 Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
 "H. Peter Anvin" <hpa@...or.com>
Cc: linux-arm-kernel@...ts.infradead.org, x86@...nel.org,
 linuxppc-dev@...ts.ozlabs.org, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 22/25] mm: Add pte_batch_hint() to reduce scanning in
 folio_pte_batch()

On 12.02.24 16:47, Ryan Roberts wrote:
> On 12/02/2024 13:43, David Hildenbrand wrote:
>> On 02.02.24 09:07, Ryan Roberts wrote:
>>> Some architectures (e.g. arm64) can tell from looking at a pte, if some
>>> follow-on ptes also map contiguous physical memory with the same pgprot.
>>> (for arm64, these are contpte mappings).
>>>
>>> Take advantage of this knowledge to optimize folio_pte_batch() so that
>>> it can skip these ptes when scanning to create a batch. By default, if
>>> an arch does not opt-in, folio_pte_batch() returns a compile-time 1, so
>>> the changes are optimized out and the behaviour is as before.
>>>
>>> arm64 will opt-in to providing this hint in the next patch, which will
>>> greatly reduce the cost of ptep_get() when scanning a range of contptes.
>>>
>>> Tested-by: John Hubbard <jhubbard@...dia.com>
>>> Signed-off-by: Ryan Roberts <ryan.roberts@....com>
>>> ---
>>>    include/linux/pgtable.h | 18 ++++++++++++++++++
>>>    mm/memory.c             | 20 +++++++++++++-------
>>>    2 files changed, 31 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>>> index 50f32cccbd92..cba31f177d27 100644
>>> --- a/include/linux/pgtable.h
>>> +++ b/include/linux/pgtable.h
>>> @@ -212,6 +212,24 @@ static inline int pmd_dirty(pmd_t pmd)
>>>    #define arch_flush_lazy_mmu_mode()    do {} while (0)
>>>    #endif
>>>    +#ifndef pte_batch_hint
>>> +/**
>>> + * pte_batch_hint - Number of pages that can be added to batch without scanning.
>>> + * @ptep: Page table pointer for the entry.
>>> + * @pte: Page table entry.
>>> + *
>>> + * Some architectures know that a set of contiguous ptes all map the same
>>> + * contiguous memory with the same permissions. In this case, it can provide a
>>> + * hint to aid pte batching without the core code needing to scan every pte.
>>
>> I think we might want to document here the expectation regarding
>> dirty/accessed bits. folio_pte_batch() will ignore dirty bits only with
>> FPB_IGNORE_DIRTY. But especially for arm64, it makes sense to ignore them
>> always when batching, because the dirty bit may target any pte part of the
>> cont-pte group either way.
>>
>> Maybe something like:
>>
>> "
>> An architecture implementation may only ignore the PTE accessed and dirty bits.
>> Further, it may only ignore the dirty bit if that bit is already not
>> maintained with precision per PTE inside the hinted batch, and ptep_get()
>> would already have to collect it from various PTEs.
>> "
> 
> I'm proposing to simplify this to:
> 
> "
> An architecture implementation may ignore the PTE accessed state. Further, the
> dirty state must apply atomically to all the PTEs described by the hint.
> "
> 
> Which I think more accurately describes the requirement. Shout if you disagree.

I'm not 100% sure if the "must apply atomically" is clear without all of 
the cont-pte details and ptep_get(). But I fail to describe it in a 
better way.

It's all better compared to what we had before, so LGTM :)

-- 
Cheers,

David / dhildenb


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ