[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zik2zs0cBeJ_AzED@arm.com>
Date: Wed, 24 Apr 2024 17:43:58 +0100
From: Catalin Marinas <catalin.marinas@....com>
To: Ryan Roberts <ryan.roberts@....com>
Cc: Will Deacon <will@...nel.org>, Joey Gouly <joey.gouly@....com>,
Ard Biesheuvel <ardb@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Anshuman Khandual <anshuman.khandual@....com>,
David Hildenbrand <david@...hat.com>, Peter Xu <peterx@...hat.com>,
Mike Rapoport <rppt@...ux.ibm.com>,
Shivansh Vij <shivanshvij@...look.com>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 1/2] arm64/mm: Move PTE_PROT_NONE and
PMD_PRESENT_INVALID
On Wed, Apr 24, 2024 at 12:10:16PM +0100, Ryan Roberts wrote:
> Previously PTE_PROT_NONE was occupying bit 58, one of the bits reserved
> for SW use when the PTE is valid. This is a waste of those precious SW
> bits since PTE_PROT_NONE can only ever be set when valid is clear.
> Instead let's overlay it on what would be a HW bit if valid was set.
>
> We need to be careful about which HW bit to choose since some of them
> must be preserved; when pte_present() is true (as it is for a
> PTE_PROT_NONE pte), it is legitimate for the core to call various
> accessors, e.g. pte_dirty(), pte_write() etc. There are also some
> accessors that are private to the arch which must continue to be
> honoured, e.g. pte_user(), pte_user_exec() etc.
>
> So we choose to overlay PTE_UXN; This effectively means that whenever a
> pte has PTE_PROT_NONE set, it will always report pte_user_exec() ==
> false, which is obviously always correct.
>
> As a result of this change, we must shuffle the layout of the
> arch-specific swap pte so that PTE_PROT_NONE is always zero and not
> overlapping with any other field. As a result of this, there is no way
> to keep the `type` field contiguous without conflicting with
> PMD_PRESENT_INVALID (bit 59), which must also be 0 for a swap pte. So
> let's move PMD_PRESENT_INVALID to bit 60.
I think we discussed but forgot the details. What was the reason for not
using, say, bit 60 for PTE_PROT_NONE to avoid all the swap bits
reshuffling? Clearing or setting of the PTE_PROT_NONE bit is done via
pte_modify() and this gets all the new permission bits anyway. With POE
support (on the list for now), PTE_PROT_NONE would overlap with
POIndex[0] but I don't think we ever plan to read this field (other than
maybe ptdump). The POIndex field is set from the vma->vm_page_prot (Joey
may need to adjust vm_get_page_prot() in his patches to avoid setting a
pkey on a PROT_NONE mapping).
--
Catalin
Powered by blists - more mailing lists