linux-kernel - [PATCH 1/4] mm/mprotect: Retry on pmd_trans

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20230602230552.350731-2-peterx@redhat.com>
Date:   Fri,  2 Jun 2023 19:05:49 -0400
From:   Peter Xu <peterx@...hat.com>
To:     linux-kernel@...r.kernel.org, linux-mm@...ck.org
Cc:     David Hildenbrand <david@...hat.com>,
        Alistair Popple <apopple@...dia.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        "Kirill A . Shutemov" <kirill@...temov.name>,
        Johannes Weiner <hannes@...xchg.org>,
        John Hubbard <jhubbard@...dia.com>,
        Naoya Horiguchi <naoya.horiguchi@....com>, peterx@...hat.com,
        Muhammad Usama Anjum <usama.anjum@...labora.com>,
        Hugh Dickins <hughd@...gle.com>,
        Mike Rapoport <rppt@...nel.org>
Subject: [PATCH 1/4] mm/mprotect: Retry on pmd_trans_unstable()

When hit unstable pmd, we should retry the pmd once more because it means
we probably raced with a thp insertion.

Skipping it might be a problem as no error will be reported to the caller.
I assume it means the user will expect prot changed (e.g. mprotect or
userfaultfd wr-protections) applied but it's actually not.

To achieve it, move the pmd_trans_unstable() call out of change_pte_range()
which will make the retry easier, as we can keep the retval of
change_pte_range() untouched.

Signed-off-by: Peter Xu <peterx@...hat.com>
---
 mm/mprotect.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 92d3d3ca390a..e4756899d40c 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -94,15 +94,6 @@ static long change_pte_range(struct mmu_gather *tlb,
 
 	tlb_change_page_size(tlb, PAGE_SIZE);
 
-	/*
-	 * Can be called with only the mmap_lock for reading by
-	 * prot_numa so we must check the pmd isn't constantly
-	 * changing from under us from pmd_none to pmd_trans_huge
-	 * and/or the other way around.
-	 */
-	if (pmd_trans_unstable(pmd))
-		return 0;
-
 	/*
 	 * The pmd points to a regular pte so the pmd can't change
 	 * from under us even if the mmap_lock is only hold for
@@ -411,6 +402,7 @@ static inline long change_pmd_range(struct mmu_gather *tlb,
 			pages = ret;
 			break;
 		}
+again:
 		/*
 		 * Automatic NUMA balancing walks the tables with mmap_lock
 		 * held for read. It's possible a parallel update to occur
@@ -465,6 +457,16 @@ static inline long change_pmd_range(struct mmu_gather *tlb,
 			}
 			/* fall through, the trans huge pmd just split */
 		}
+
+		/*
+		 * Can be called with only the mmap_lock for reading by
+		 * prot_numa or userfaultfd-wp, so we must check the pmd
+		 * isn't constantly changing from under us from pmd_none to
+		 * pmd_trans_huge and/or the other way around.
+		 */
+		if (pmd_trans_unstable(pmd))
+			goto again;
+
 		pages += change_pte_range(tlb, vma, pmd, addr, next,
 					  newprot, cp_flags);
 next:
-- 
2.40.1