[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240523132139.289719-1-peterx@redhat.com>
Date: Thu, 23 May 2024 09:21:39 -0400
From: Peter Xu <peterx@...hat.com>
To: linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Cc: Andrew Morton <akpm@...ux-foundation.org>,
peterx@...hat.com,
David Hildenbrand <david@...hat.com>,
Pavel Tatashin <pasha.tatashin@...een.com>,
"Aneesh Kumar K . V" <aneesh.kumar@...ux.ibm.com>,
Gavin Shan <gshan@...hat.com>,
Anshuman Khandual <anshuman.khandual@....com>,
Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
Subject: [PATCH] mm/debug_vm_pgtable: Drop RANDOM_ORVALUE trick
Macro RANDOM_ORVALUE was used to make sure the pgtable entry will be
populated with !none data in clear tests.
The RANDOM_ORVALUE tried to cover mostly all the bits in a pgtable entry,
even if there's no discussion on whether all the bits will be vaild. Both
S390 and PPC64 have their own masks to avoid touching some bits. Now it's
the turn for x86_64.
The issue is there's a recent report from Mikhail Gavrilov showing that
this can cause a warning with the newly added pte set check in commit
8430557fc5 on writable v.s. userfaultfd-wp bit, even though the check
itself was valid, the random pte is not. We can choose to mask more bits
out.
However the need to have such random bits setup is questionable, as now
it's already guaranteed to be true on below:
- For pte level, the pgtable entry will be installed with value from
pfn_pte(), where pfn points to a valid page. Hence the pte will be
!none already if populated with pfn_pte().
- For upper-than-pte level, the pgtable entry should contain a directory
entry always, which is also !none.
All the cases look like good enough to test a pxx_clear() helper. Instead
of extending the bitmask, drop the "set random bits" trick completely. Add
some warning guards to make sure the entries will be !none before clear().
Cc: David Hildenbrand <david@...hat.com>
Cc: Pavel Tatashin <pasha.tatashin@...een.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@...ux.ibm.com>
Cc: Gavin Shan <gshan@...hat.com>
Cc: Anshuman Khandual <anshuman.khandual@....com>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
Link: https://lore.kernel.org/r/CABXGCsMB9A8-X+Np_Q+fWLURYL_0t3Y-MdoNabDM-Lzk58-DGA@mail.gmail.com
Signed-off-by: Peter Xu <peterx@...hat.com>
---
mm/debug_vm_pgtable.c | 31 +++++--------------------------
1 file changed, 5 insertions(+), 26 deletions(-)
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index f1c9a2c5abc0..866eddb6cfda 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -40,22 +40,7 @@
* Please refer Documentation/mm/arch_pgtable_helpers.rst for the semantics
* expectations that are being validated here. All future changes in here
* or the documentation need to be in sync.
- *
- * On s390 platform, the lower 4 bits are used to identify given page table
- * entry type. But these bits might affect the ability to clear entries with
- * pxx_clear() because of how dynamic page table folding works on s390. So
- * while loading up the entries do not change the lower 4 bits. It does not
- * have affect any other platform. Also avoid the 62nd bit on ppc64 that is
- * used to mark a pte entry.
*/
-#define S390_SKIP_MASK GENMASK(3, 0)
-#if __BITS_PER_LONG == 64
-#define PPC64_SKIP_MASK GENMASK(62, 62)
-#else
-#define PPC64_SKIP_MASK 0x0
-#endif
-#define ARCH_SKIP_MASK (S390_SKIP_MASK | PPC64_SKIP_MASK)
-#define RANDOM_ORVALUE (GENMASK(BITS_PER_LONG - 1, 0) & ~ARCH_SKIP_MASK)
#define RANDOM_NZVALUE GENMASK(7, 0)
struct pgtable_debug_args {
@@ -511,8 +496,7 @@ static void __init pud_clear_tests(struct pgtable_debug_args *args)
return;
pr_debug("Validating PUD clear\n");
- pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
- WRITE_ONCE(*args->pudp, pud);
+ WARN_ON(pud_none(pud));
pud_clear(args->pudp);
pud = READ_ONCE(*args->pudp);
WARN_ON(!pud_none(pud));
@@ -548,8 +532,7 @@ static void __init p4d_clear_tests(struct pgtable_debug_args *args)
return;
pr_debug("Validating P4D clear\n");
- p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
- WRITE_ONCE(*args->p4dp, p4d);
+ WARN_ON(p4d_none(p4d));
p4d_clear(args->p4dp);
p4d = READ_ONCE(*args->p4dp);
WARN_ON(!p4d_none(p4d));
@@ -582,8 +565,7 @@ static void __init pgd_clear_tests(struct pgtable_debug_args *args)
return;
pr_debug("Validating PGD clear\n");
- pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE);
- WRITE_ONCE(*args->pgdp, pgd);
+ WARN_ON(pgd_none(pgd));
pgd_clear(args->pgdp);
pgd = READ_ONCE(*args->pgdp);
WARN_ON(!pgd_none(pgd));
@@ -634,10 +616,8 @@ static void __init pte_clear_tests(struct pgtable_debug_args *args)
if (WARN_ON(!args->ptep))
return;
-#ifndef CONFIG_RISCV
- pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
-#endif
set_pte_at(args->mm, args->vaddr, args->ptep, pte);
+ WARN_ON(pte_none(pte));
flush_dcache_page(page);
barrier();
ptep_clear(args->mm, args->vaddr, args->ptep);
@@ -650,8 +630,7 @@ static void __init pmd_clear_tests(struct pgtable_debug_args *args)
pmd_t pmd = READ_ONCE(*args->pmdp);
pr_debug("Validating PMD clear\n");
- pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE);
- WRITE_ONCE(*args->pmdp, pmd);
+ WARN_ON(pmd_none(pmd));
pmd_clear(args->pmdp);
pmd = READ_ONCE(*args->pmdp);
WARN_ON(!pmd_none(pmd));
--
2.45.0
Powered by blists - more mailing lists