linux-kernel - Re: [PATCH v1 1/2] arm64/mm: Move PTE_PROT_NONE and PMD_PRESENT

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <f5de5685-d955-4aa0-a307-a4da927f36f0@arm.com>
Date: Mon, 29 Apr 2024 11:04:53 +0100
From: Ryan Roberts <ryan.roberts@....com>
To: Catalin Marinas <catalin.marinas@....com>
Cc: David Hildenbrand <david@...hat.com>, Will Deacon <will@...nel.org>,
 Joey Gouly <joey.gouly@....com>, Ard Biesheuvel <ardb@...nel.org>,
 Mark Rutland <mark.rutland@....com>,
 Anshuman Khandual <anshuman.khandual@....com>, Peter Xu <peterx@...hat.com>,
 Mike Rapoport <rppt@...ux.ibm.com>, Shivansh Vij <shivanshvij@...look.com>,
 linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 1/2] arm64/mm: Move PTE_PROT_NONE and
 PMD_PRESENT_INVALID

On 26/04/2024 15:48, Catalin Marinas wrote:
> On Thu, Apr 25, 2024 at 11:37:42AM +0100, Ryan Roberts wrote:
>> Also, IMHO we shouldn't really need to reserve PMD_PRESENT_INVALID for swap
>> ptes; it would be cleaner to have one bit that defines "present" when valid is
>> clear (similar to PTE_PROT_NONE today) then another bit which is only defined
>> when "present && !valid" which tells us if this is PTE_PROT_NONE or
>> PMD_PRESENT_INVALID (I don't think you can ever have both at the same time?).
> 
> I think this make sense, maybe rename the above to PTE_PRESENT_INVALID
> and use it for both ptes and pmds.

Yep, sounds good. I've already got a patch to do this, but it's exposed a bug in
core-mm so will now fix that before I can validate my change. see
https://lore.kernel.org/linux-arm-kernel/ZiuyGXt0XWwRgFh9@x1n/

With this in place, I'm proposing to remove PTE_PROT_NONE entirely and instead
represent PROT_NONE as a present but invalid pte (PTE_VALID=0, PTE_INVALID=1)
with both PTE_WRITE=0 and PTE_RDONLY=0.

While the HW would interpret PTE_WRITE=0/PTE_RDONLY=0 as "RW without dirty bit
modification", this is not a problem as the pte is invalid, so the HW doesn't
interpret it. And SW always uses the PTE_WRITE bit to interpret the writability
of the pte. So PTE_WRITE=0/PTE_RDONLY=0 was previously an unused combination
that we now repurpose for PROT_NONE.

This will subtly change behaviour in an edge case though. Imagine:

pte_t pte;

pte = pte_modify(pte, PAGE_NONE);
pte = pte_mkwrite_novma(pte);
WARN_ON(pte_protnone(pte));

Should that warning fire or not? Previously, because we had a dedicated bit for
PTE_PROT_NONE it would fire. With my proposed change it will not fire. To me
it's more intuitive if it doesn't fire. Regardless there is no core code that
ever does this. Once you have a protnone pte, its terminal - nothing ever
modifies it with these helpers AFAICS.

Personally I think this is a nice tidy up that saves a SW bit in both present
and swap ptes. What do you think? (I'll just post the series if its easier to
provide feedback in that context).

> 
>> But there is a problem with this: __split_huge_pmd_locked() calls
>> pmdp_invalidate() for a pmd before it determines that it is pmd_present(). So
>> the PMD_PRESENT_INVALID can be set in a swap pte today. That feels wrong to me,
>> but was trying to avoid the whole thing unravelling so didn't persue.
> 
> Maybe what's wrong is the arm64 implementation setting this bit on a
> swap/migration pmd (though we could handle this in the core code as
> well, it depends what the other architectures do). The only check for
> the PMD_PRESENT_INVALID bit is in the arm64 code and it can be absorbed
> into the pmd_present() check. I think it is currently broken as
> pmd_present() can return true for a swap pmd after pmd_mkinvalid().

I've posted a fix here:
https://lore.kernel.org/linux-mm/20240425170704.3379492-1-ryan.roberts@arm.com/

My position is that you shouldn't be calling pmd_mkinvalid() on a non-present pmd.

> 
> So I don't think we lose anything if pmd_mkinvalid() skips any bit
> setting when !PTE_VALID. Maybe it even fixes some corner case we never
> hit yet (like pmd_present() on a swap/migration+invalid pmd).
>