lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <70dcdad7-5952-48ce-a9b9-042cfea59a5d@arm.com>
Date:   Tue, 5 Dec 2023 14:43:01 +0000
From:   Ryan Roberts <ryan.roberts@....com>
To:     James Houghton <jthoughton@...gle.com>,
        Steve Capper <steve.capper@....com>,
        Will Deacon <will@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     Mike Kravetz <mike.kravetz@...cle.com>,
        Muchun Song <songmuchun@...edance.com>,
        Anshuman Khandual <anshuman.khandual@....com>,
        Catalin Marinas <catalin.marinas@....com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/2] arm64: hugetlb: Fix page fault loop for
 sw-dirty/hw-clean contiguous PTEs

On 04/12/2023 17:26, James Houghton wrote:
> It is currently possible for a userspace application to enter a page
> fault loop when using HugeTLB pages implemented with contiguous PTEs
> when HAFDBS is not available. This happens because:
> 1. The kernel may sometimes write PTEs that are sw-dirty but hw-clean
>    (PTE_DIRTY | PTE_RDONLY | PTE_WRITE).

Hi James,

Do you know how this happens?

AFAIK, this is the set of valid bit combinations, and
PTE_RDONLY|PTE_WRITE|PTE_DIRTY is not one of them. Perhaps the real solution is
to understand how this is happening and prevent it?

/*
 * PTE bits configuration in the presence of hardware Dirty Bit Management
 * (PTE_WRITE == PTE_DBM):
 *
 * Dirty  Writable | PTE_RDONLY  PTE_WRITE  PTE_DIRTY (sw)
 *   0      0      |   1           0          0
 *   0      1      |   1           1          0
 *   1      0      |   1           0          1
 *   1      1      |   0           1          x
 *
 * When hardware DBM is not present, the sofware PTE_DIRTY bit is updated via
 * the page fault mechanism. Checking the dirty status of a pte becomes:
 *
 *   PTE_DIRTY || (PTE_WRITE && !PTE_RDONLY)
 */


Thanks,
Ryan


> 2. If, during a write, the CPU uses a sw-dirty, hw-clean PTE in handling
>    the memory access on a system without HAFDBS, we will get a page
>    fault.
> 3. HugeTLB will check if it needs to update the dirty bits on the PTE.
>    For contiguous PTEs, it will check to see if the pgprot bits need
>    updating. In this case, HugeTLB wants to write a sequence of
>    sw-dirty, hw-dirty PTEs, but it finds that all the PTEs it is about
>    to overwrite are all pte_dirty() (pte_sw_dirty() => pte_dirty()),
>    so it thinks no update is necessary.
> 
> Please see this[1] reproducer.
> 
> I think (though I may be wrong) that both step (1) and step (3) are
> buggy.
> 
> The first patch in this series fixes step (3); instead of checking if
> pte_dirty is matching in __cont_access_flags_changed, check pte_hw_dirty
> and pte_sw_dirty separately.
> 
> The second patch in this series makes step (1) less likely to occur.
> Without this patch, we can get the kernel to write a sw-dirty, hw-clean
> PTE with the following steps (showing the relevant VMA flags and pgprot
> bits):
> i.   Create a valid, writable contiguous PTE.
>        VMA vmflags:     VM_SHARED | VM_READ | VM_WRITE
>        VMA pgprot bits: PTE_RDONLY | PTE_WRITE
>        PTE pgprot bits: PTE_DIRTY | PTE_WRITE
> ii.  mprotect the VMA to PROT_NONE.
>        VMA vmflags:     VM_SHARED
>        VMA pgprot bits: PTE_RDONLY
>        PTE pgprot bits: PTE_DIRTY | PTE_RDONLY
> iii. mprotect the VMA back to PROT_READ | PROT_WRITE.
>        VMA vmflags:     VM_SHARED | VM_READ | VM_WRITE
>        VMA pgprot bits: PTE_RDONLY | PTE_WRITE
>        PTE pgprot bits: PTE_DIRTY | PTE_WRITE | PTE_RDONLY
> 
> Applying either one of the two patches in this patchset will fix the
> particular issue with HugeTLB pages implemented with contiguous PTEs.
> It's possible that only one of these patches should be taken, or that
> the right fix is something else entirely.
> 
> [1]: https://gist.github.com/48ca/11d1e466deee032cb35aa8c2280f93b0
> 
> James Houghton (2):
>   arm64: hugetlb: Distinguish between hw and sw dirtiness in
>     __cont_access_flags_changed
>   arm64: mm: Always make sw-dirty PTEs hw-dirty in pte_modify
> 
>  arch/arm64/include/asm/pgtable.h | 6 ++++++
>  arch/arm64/mm/hugetlbpage.c      | 5 ++++-
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> 
> base-commit: 645a9a454fdb7e698a63a275edca6a17ef97afc4

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ