[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20230612212723.196693-1-sj@kernel.org>
Date: Mon, 12 Jun 2023 21:27:23 +0000
From: SeongJae Park <sj@...nel.org>
To: Ryan Roberts <ryan.roberts@....com>
Cc: "SeongJae Park" <sj@...nel.org>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
"Mike Rapoport" <rppt@...nel.org>, "Yu Zhao" <yuzhao@...gle.com>,
"Jason Gunthorpe" <jgg@...pe.ca>,
"David Airlie" <airlied@...il.com>,
"Daniel Vetter" <daniel@...ll.ch>,
"Dimitri Sivanich" <dimitri.sivanich@....com>,
"Alex Williamson" <alex.williamson@...hat.com>,
"Oleksandr Tyshchenko" <oleksandr_tyshchenko@...m.com>,
"Alexander Viro" <viro@...iv.linux.org.uk>,
"Christian Brauner" <brauner@...nel.org>,
"Mike Kravetz" <mike.kravetz@...cle.com>,
"Muchun Song" <muchun.song@...ux.dev>,
"Mark Rutland" <mark.rutland@....com>,
"Alexander Shishkin" <alexander.shishkin@...ux.intel.com>,
"Jiri Olsa" <jolsa@...nel.org>,
"Namhyung Kim" <namhyung@...nel.org>,
"Ian Rogers" <irogers@...gle.com>,
"Adrian Hunter" <adrian.hunter@...el.com>,
Jérôme Glisse <jglisse@...hat.com>,
"Andrey Ryabinin" <ryabinin.a.a@...il.com>,
"Alexander Potapenko" <glider@...gle.com>,
"Andrey Konovalov" <andreyknvl@...il.com>,
"Dmitry Vyukov" <dvyukov@...gle.com>,
"Vincenzo Frascino" <vincenzo.frascino@....com>,
"Johannes Weiner" <hannes@...xchg.org>,
"Michal Hocko" <mhocko@...nel.org>,
"Roman Gushchin" <roman.gushchin@...ux.dev>,
"Shakeel Butt" <shakeelb@...gle.com>,
"Naoya Horiguchi" <naoya.horiguchi@....com>,
"Miaohe Lin" <linmiaohe@...wei.com>,
"Pasha Tatashin" <pasha.tatashin@...een.com>,
"Uladzislau Rezki" <urezki@...il.com>,
"Christoph Hellwig" <hch@...radead.org>,
"Lorenzo Stoakes" <lstoakes@...il.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
damon@...ts.linux.dev, kernel test robot <lkp@...el.com>
Subject: Re: [PATCH v3 3/3] mm: ptep_get() conversion
On Mon, 12 Jun 2023 16:15:45 +0100 Ryan Roberts <ryan.roberts@....com> wrote:
> Convert all instances of direct pte_t* dereferencing to instead use
> ptep_get() helper. This means that by default, the accesses change from
> a C dereference to a READ_ONCE(). This is technically the correct thing
> to do since where pgtables are modified by HW (for access/dirty) they
> are volatile and therefore we should always ensure READ_ONCE()
> semantics.
>
> But more importantly, by always using the helper, it can be overridden
> by the architecture to fully encapsulate the contents of the pte. Arch
> code is deliberately not converted, as the arch code knows best. It is
> intended that arch code (arm64) will override the default with its own
> implementation that can (e.g.) hide certain bits from the core code, or
> determine young/dirty status by mixing in state from another source.
>
> Conversion was done using Coccinelle:
>
> ----
>
> // $ make coccicheck \
> // COCCI=ptepget.cocci \
> // SPFLAGS="--include-headers" \
> // MODE=patch
>
> virtual patch
>
> @ depends on patch @
> pte_t *v;
> @@
>
> - *v
> + ptep_get(v)
>
> ----
>
> Then reviewed and hand-edited to avoid multiple unnecessary calls to
> ptep_get(), instead opting to store the result of a single call in a
> variable, where it is correct to do so. This aims to negate any cost of
> READ_ONCE() and will benefit arch-overrides that may be more complex.
>
> Included is a fix for an issue in an earlier version of this patch that
> was pointed out by kernel test robot. The issue arose because config
> MMU=n elides definition of the ptep helper functions, including
> ptep_get(). HUGETLB_PAGE=n configs still define a simple
> huge_ptep_clear_flush() for linking purposes, which dereferences the
> ptep. So when both configs are disabled, this caused a build error
> because ptep_get() is not defined. Fix by continuing to do a direct
> dereference when MMU=n. This is safe because for this config the arch
> code cannot be trying to virtualize the ptes because none of the ptep
> helpers are defined.
>
> Reported-by: kernel test robot <lkp@...el.com>
> Link: https://lore.kernel.org/oe-kbuild-all/202305120142.yXsNEo6H-lkp@intel.com/
> Signed-off-by: Ryan Roberts <ryan.roberts@....com>
> ---
> .../drm/i915/gem/selftests/i915_gem_mman.c | 8 +-
> drivers/misc/sgi-gru/grufault.c | 2 +-
> drivers/vfio/vfio_iommu_type1.c | 7 +-
> drivers/xen/privcmd.c | 2 +-
> fs/proc/task_mmu.c | 33 +++---
> fs/userfaultfd.c | 6 +-
> include/linux/hugetlb.h | 4 +
> include/linux/mm_inline.h | 2 +-
> include/linux/pgtable.h | 6 +-
> kernel/events/uprobes.c | 2 +-
> mm/damon/ops-common.c | 2 +-
> mm/damon/paddr.c | 2 +-
> mm/damon/vaddr.c | 10 +-
[...]
> diff --git a/mm/damon/ops-common.c b/mm/damon/ops-common.c
> index d4ab81229136..e940802a15a4 100644
> --- a/mm/damon/ops-common.c
> +++ b/mm/damon/ops-common.c
> @@ -39,7 +39,7 @@ struct folio *damon_get_folio(unsigned long pfn)
>
> void damon_ptep_mkold(pte_t *pte, struct vm_area_struct *vma, unsigned long addr)
> {
> - struct folio *folio = damon_get_folio(pte_pfn(*pte));
> + struct folio *folio = damon_get_folio(pte_pfn(ptep_get(pte)));
>
> if (!folio)
> return;
> diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
> index 5b3a3463d078..40801e38fcf0 100644
> --- a/mm/damon/paddr.c
> +++ b/mm/damon/paddr.c
> @@ -89,7 +89,7 @@ static bool __damon_pa_young(struct folio *folio, struct vm_area_struct *vma,
> while (page_vma_mapped_walk(&pvmw)) {
> addr = pvmw.address;
> if (pvmw.pte) {
> - *accessed = pte_young(*pvmw.pte) ||
> + *accessed = pte_young(ptep_get(pvmw.pte)) ||
> !folio_test_idle(folio) ||
> mmu_notifier_test_young(vma->vm_mm, addr);
> } else {
> diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
> index e814f66dfc2e..2fcc9731528a 100644
> --- a/mm/damon/vaddr.c
> +++ b/mm/damon/vaddr.c
> @@ -323,7 +323,7 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
> walk->action = ACTION_AGAIN;
> return 0;
> }
> - if (!pte_present(*pte))
> + if (!pte_present(ptep_get(pte)))
> goto out;
> damon_ptep_mkold(pte, walk->vma, addr);
> out:
> @@ -433,6 +433,7 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
> unsigned long next, struct mm_walk *walk)
> {
> pte_t *pte;
> + pte_t ptent;
> spinlock_t *ptl;
> struct folio *folio;
> struct damon_young_walk_private *priv = walk->private;
> @@ -471,12 +472,13 @@ static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
> walk->action = ACTION_AGAIN;
> return 0;
> }
> - if (!pte_present(*pte))
> + ptent = ptep_get(pte);
> + if (!pte_present(ptent))
> goto out;
> - folio = damon_get_folio(pte_pfn(*pte));
> + folio = damon_get_folio(pte_pfn(ptent));
> if (!folio)
> goto out;
> - if (pte_young(*pte) || !folio_test_idle(folio) ||
> + if (pte_young(ptent) || !folio_test_idle(folio) ||
> mmu_notifier_test_young(walk->mm, addr))
> priv->young = true;
> *priv->folio_sz = folio_size(folio);
For above DAMON part,
Reviewed-by: SeongJae Park <sj@...nel.org>
Thanks,
SJ
[...]
Powered by blists - more mailing lists