[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <df0475e1-9078-4629-b23d-0919ab1e37c2@arm.com>
Date: Thu, 25 Apr 2024 11:29:07 +0100
From: Ryan Roberts <ryan.roberts@....com>
To: David Hildenbrand <david@...hat.com>,
Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
Joey Gouly <joey.gouly@....com>, Ard Biesheuvel <ardb@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Anshuman Khandual <anshuman.khandual@....com>, Peter Xu <peterx@...hat.com>,
Mike Rapoport <rppt@...ux.ibm.com>, Shivansh Vij <shivanshvij@...look.com>
Cc: linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 1/2] arm64/mm: Move PTE_PROT_NONE and
PMD_PRESENT_INVALID
On 25/04/2024 10:16, David Hildenbrand wrote:
> On 24.04.24 13:10, Ryan Roberts wrote:
>> Previously PTE_PROT_NONE was occupying bit 58, one of the bits reserved
>> for SW use when the PTE is valid. This is a waste of those precious SW
>> bits since PTE_PROT_NONE can only ever be set when valid is clear.
>> Instead let's overlay it on what would be a HW bit if valid was set.
>>
>> We need to be careful about which HW bit to choose since some of them
>> must be preserved; when pte_present() is true (as it is for a
>> PTE_PROT_NONE pte), it is legitimate for the core to call various
>> accessors, e.g. pte_dirty(), pte_write() etc. There are also some
>> accessors that are private to the arch which must continue to be
>> honoured, e.g. pte_user(), pte_user_exec() etc.
>>
>> So we choose to overlay PTE_UXN; This effectively means that whenever a
>> pte has PTE_PROT_NONE set, it will always report pte_user_exec() ==
>> false, which is obviously always correct.
>>
>> As a result of this change, we must shuffle the layout of the
>> arch-specific swap pte so that PTE_PROT_NONE is always zero and not
>> overlapping with any other field. As a result of this, there is no way
>> to keep the `type` field contiguous without conflicting with
>> PMD_PRESENT_INVALID (bit 59), which must also be 0 for a swap pte. So
>> let's move PMD_PRESENT_INVALID to bit 60.
>
> A note that some archs split/re-combine type and/or offset, to make use of every
> bit possible :) But that's mostly relevant for 32bit.
>
> (and as long as PFNs can still fit into the swp offset for migration entries etc.)
Yeah, I considered splitting the type or offset field to avoid moving
PMD_PRESENT_INVALID, but thought it was better to avoid the extra mask and shift.
>
>>
>> In the end, this frees up bit 58 for future use as a proper SW bit (e.g.
>> soft-dirty or uffd-wp).
>
> I was briefly confused about how you would use these bits as SW bits for swap
> PTEs (which you can't as they overlay the type). See below regarding bit 3.
>
> I would have said here "proper SW bit for present PTEs".
Yes; I'll clarify in the next version.
>
>>
>> Signed-off-by: Ryan Roberts <ryan.roberts@....com>
>> ---
>> arch/arm64/include/asm/pgtable-prot.h | 4 ++--
>> arch/arm64/include/asm/pgtable.h | 16 +++++++++-------
>> 2 files changed, 11 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable-prot.h
>> b/arch/arm64/include/asm/pgtable-prot.h
>> index dd9ee67d1d87..ef952d69fd04 100644
>> --- a/arch/arm64/include/asm/pgtable-prot.h
>> +++ b/arch/arm64/include/asm/pgtable-prot.h
>> @@ -18,14 +18,14 @@
>> #define PTE_DIRTY (_AT(pteval_t, 1) << 55)
>> #define PTE_SPECIAL (_AT(pteval_t, 1) << 56)
>> #define PTE_DEVMAP (_AT(pteval_t, 1) << 57)
>> -#define PTE_PROT_NONE (_AT(pteval_t, 1) << 58) /* only when !PTE_VALID */
>> +#define PTE_PROT_NONE (PTE_UXN) /* Reuse PTE_UXN; only when
>> !PTE_VALID */
>> /*
>> * This bit indicates that the entry is present i.e. pmd_page()
>> * still points to a valid huge page in memory even if the pmd
>> * has been invalidated.
>> */
>> -#define PMD_PRESENT_INVALID (_AT(pteval_t, 1) << 59) /* only when
>> !PMD_SECT_VALID */
>> +#define PMD_PRESENT_INVALID (_AT(pteval_t, 1) << 60) /* only when
>> !PMD_SECT_VALID */
>> #define _PROT_DEFAULT (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
>> #define _PROT_SECT_DEFAULT (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index afdd56d26ad7..23aabff4fa6f 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -1248,20 +1248,22 @@ static inline pmd_t pmdp_establish(struct
>> vm_area_struct *vma,
>> * Encode and decode a swap entry:
>> * bits 0-1: present (must be zero)
>> * bits 2: remember PG_anon_exclusive
>> - * bits 3-7: swap type
>> - * bits 8-57: swap offset
>> - * bit 58: PTE_PROT_NONE (must be zero)
>
> Reading this patch alone: what happened to bit 3? Please mention that that it
> will be used as a swap pte metadata bit (uffd-wp).
Will do. It's all a bit arbitrary though. I could have put offset in 3-52, and
then 53 would have been spare for uffd-wp. I'm not sure there is any advantage
to either option.
>
>> + * bits 4-53: swap offset
>
> So we'll still have 50bit for the offset, good. We could even use 61-63 if ever
> required to store bigger PFNs.
yep, or more sw bits.
>
> LGTM
>
> Reviewed-by: David Hildenbrand <david@...hat.com>
Thanks!
Powered by blists - more mailing lists