[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b55558a5-a9d4-4aea-956a-1babad01b6cd@redhat.com>
Date: Thu, 25 Apr 2024 11:16:43 +0200
From: David Hildenbrand <david@...hat.com>
To: Ryan Roberts <ryan.roberts@....com>,
Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
Joey Gouly <joey.gouly@....com>, Ard Biesheuvel <ardb@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Anshuman Khandual <anshuman.khandual@....com>, Peter Xu <peterx@...hat.com>,
Mike Rapoport <rppt@...ux.ibm.com>, Shivansh Vij <shivanshvij@...look.com>
Cc: linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 1/2] arm64/mm: Move PTE_PROT_NONE and
PMD_PRESENT_INVALID
On 24.04.24 13:10, Ryan Roberts wrote:
> Previously PTE_PROT_NONE was occupying bit 58, one of the bits reserved
> for SW use when the PTE is valid. This is a waste of those precious SW
> bits since PTE_PROT_NONE can only ever be set when valid is clear.
> Instead let's overlay it on what would be a HW bit if valid was set.
>
> We need to be careful about which HW bit to choose since some of them
> must be preserved; when pte_present() is true (as it is for a
> PTE_PROT_NONE pte), it is legitimate for the core to call various
> accessors, e.g. pte_dirty(), pte_write() etc. There are also some
> accessors that are private to the arch which must continue to be
> honoured, e.g. pte_user(), pte_user_exec() etc.
>
> So we choose to overlay PTE_UXN; This effectively means that whenever a
> pte has PTE_PROT_NONE set, it will always report pte_user_exec() ==
> false, which is obviously always correct.
>
> As a result of this change, we must shuffle the layout of the
> arch-specific swap pte so that PTE_PROT_NONE is always zero and not
> overlapping with any other field. As a result of this, there is no way
> to keep the `type` field contiguous without conflicting with
> PMD_PRESENT_INVALID (bit 59), which must also be 0 for a swap pte. So
> let's move PMD_PRESENT_INVALID to bit 60.
A note that some archs split/re-combine type and/or offset, to make use
of every bit possible :) But that's mostly relevant for 32bit.
(and as long as PFNs can still fit into the swp offset for migration
entries etc.)
>
> In the end, this frees up bit 58 for future use as a proper SW bit (e.g.
> soft-dirty or uffd-wp).
I was briefly confused about how you would use these bits as SW bits for
swap PTEs (which you can't as they overlay the type). See below
regarding bit 3.
I would have said here "proper SW bit for present PTEs".
>
> Signed-off-by: Ryan Roberts <ryan.roberts@....com>
> ---
> arch/arm64/include/asm/pgtable-prot.h | 4 ++--
> arch/arm64/include/asm/pgtable.h | 16 +++++++++-------
> 2 files changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
> index dd9ee67d1d87..ef952d69fd04 100644
> --- a/arch/arm64/include/asm/pgtable-prot.h
> +++ b/arch/arm64/include/asm/pgtable-prot.h
> @@ -18,14 +18,14 @@
> #define PTE_DIRTY (_AT(pteval_t, 1) << 55)
> #define PTE_SPECIAL (_AT(pteval_t, 1) << 56)
> #define PTE_DEVMAP (_AT(pteval_t, 1) << 57)
> -#define PTE_PROT_NONE (_AT(pteval_t, 1) << 58) /* only when !PTE_VALID */
> +#define PTE_PROT_NONE (PTE_UXN) /* Reuse PTE_UXN; only when !PTE_VALID */
>
> /*
> * This bit indicates that the entry is present i.e. pmd_page()
> * still points to a valid huge page in memory even if the pmd
> * has been invalidated.
> */
> -#define PMD_PRESENT_INVALID (_AT(pteval_t, 1) << 59) /* only when !PMD_SECT_VALID */
> +#define PMD_PRESENT_INVALID (_AT(pteval_t, 1) << 60) /* only when !PMD_SECT_VALID */
>
> #define _PROT_DEFAULT (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
> #define _PROT_SECT_DEFAULT (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index afdd56d26ad7..23aabff4fa6f 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -1248,20 +1248,22 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
> * Encode and decode a swap entry:
> * bits 0-1: present (must be zero)
> * bits 2: remember PG_anon_exclusive
> - * bits 3-7: swap type
> - * bits 8-57: swap offset
> - * bit 58: PTE_PROT_NONE (must be zero)
Reading this patch alone: what happened to bit 3? Please mention that
that it will be used as a swap pte metadata bit (uffd-wp).
> + * bits 4-53: swap offset
So we'll still have 50bit for the offset, good. We could even use 61-63
if ever required to store bigger PFNs.
LGTM
Reviewed-by: David Hildenbrand <david@...hat.com>
--
Cheers,
David / dhildenb
Powered by blists - more mailing lists