Message-Id: <20230612131656.2ba4f95865f27e6b3b984936@linux-foundation.org>
Date: Mon, 12 Jun 2023 13:16:56 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Ryan Roberts <ryan.roberts@....com>
Cc: SeongJae Park <sj@...nel.org>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Mike Rapoport <rppt@...nel.org>, Yu Zhao <yuzhao@...gle.com>,
Jason Gunthorpe <jgg@...pe.ca>,
David Airlie <airlied@...il.com>,
Daniel Vetter <daniel@...ll.ch>,
Dimitri Sivanich <dimitri.sivanich@....com>,
Alex Williamson <alex.williamson@...hat.com>,
Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>,
Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
Muchun Song <muchun.song@...ux.dev>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Jérôme Glisse <jglisse@...hat.com>,
Andrey Ryabinin <ryabinin.a.a@...il.com>,
Alexander Potapenko <glider@...gle.com>,
Andrey Konovalov <andreyknvl@...il.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Vincenzo Frascino <vincenzo.frascino@....com>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Shakeel Butt <shakeelb@...gle.com>,
Naoya Horiguchi <naoya.horiguchi@....com>,
Miaohe Lin <linmiaohe@...wei.com>,
Pasha Tatashin <pasha.tatashin@...een.com>,
Uladzislau Rezki <urezki@...il.com>,
Christoph Hellwig <hch@...radead.org>,
Lorenzo Stoakes <lstoakes@...il.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
damon@...ts.linux.dev
Subject: Re: [PATCH v3 0/3] Encapsulate PTE contents from non-arch code

On Mon, 12 Jun 2023 16:15:42 +0100 Ryan Roberts <ryan.roberts@....com> wrote:

> Hi All,
>
> (Including wider audience this time since changes touch a fair few subsystems)
>
> This is the second half of v3 of a series to improve the encapsulation of pte
> entries by disallowing non-arch code from directly dereferencing pte_t pointers.

That's basically all we have here for [0/N] cover letter content. I
stole some words from the [3/3] changelog, so we now have:

: A series to improve the encapsulation of pte entries by disallowing
: non-arch code from directly dereferencing pte_t pointers.
:
: This means that by default, the accesses change from a C dereference to a
: READ_ONCE(). This is technically the correct thing to do since where
: pgtables are modified by HW (for access/dirty) they are volatile and
: therefore we should always ensure READ_ONCE() semantics.
:
: But more importantly, by always using the helper, it can be overridden by
: the architecture to fully encapsulate the contents of the pte. Arch code
: is deliberately not converted, as the arch code knows best. It is
: intended that arch code (arm64) will override the default with its own
: implementation that can (e.g.) hide certain bits from the core code, or
: determine young/dirty status by mixing in state from another source.
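
The changelog text above can be illustrated with a small userspace model. This is only a sketch, not the kernel code: every `model_` / `MODEL_` name is hypothetical, the volatile load is a simplified stand-in for READ_ONCE(), and the "private" bit position is an arbitrary assumption rather than anything arm64 actually does.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for pte_t; the real type is defined per architecture. */
typedef struct { uint64_t pte; } pte_t;

/* Simplified stand-in for READ_ONCE(): a single volatile load. */
static inline uint64_t model_read_once_u64(const uint64_t *p)
{
	return *(const volatile uint64_t *)p;
}

/* Hypothetical bit that an architecture might want to hide from core code. */
#define MODEL_ARCH_PRIVATE_BIT (UINT64_C(1) << 55)

/* Model of the generic default: the dereference becomes one volatile load. */
static inline pte_t model_ptep_get_generic(const pte_t *ptep)
{
	pte_t pte = { model_read_once_u64(&ptep->pte) };
	return pte;
}

/* Model of an arch override: same load, with the private bit masked out. */
static inline pte_t model_ptep_get_arch(const pte_t *ptep)
{
	pte_t pte = model_ptep_get_generic(ptep);

	pte.pte &= ~MODEL_ARCH_PRIVATE_BIT;
	return pte;
}
```

Because callers only ever go through the accessor, swapping the generic version for the arch version changes what core code sees without touching any call site.
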
> Based on earlier feedback, I split the series in 2; the first part, fixes for
> existing bugs, was already posted at [3] and merged into mm-stable. This second
> part contains the conversion from direct dereferences to instead use
> ptep_get()/ptep_get_lockless().
>
> See the v1 cover letter at [1] for rationale for this work.
>
> Based on feedback at v2, I've removed the new ptep_deref() helper I originally
> added, and am now using the existing ptep_get() and ptep_get_lockless() helpers.
> Testing on Ampere Altra (arm64) showed no difference in performance when using
> ptep_deref() (*pte) vs ptep_get() (READ_ONCE(*pte)).
>
> Patches are based on mm-unstable (49e038b1919e) and a branch is available at [4]
> (Let me know if this is the wrong branch to target - I'm still not familiar with
> the details of the mm- dev process!). Note that Hugh Dickins's "mm: allow
> pte_offset_map[_lock]() to fail" (now in mm-unstable) patch set caused a number
> of conflicts which I've resolved. But due to that, you won't be able to apply
> these patches on top of Linus's tree. I have an alternate branch on top of
> v6.4-rc6 at [5].

Yep, that's all great, thanks.

Is there some clever trick we can do to prevent new open-coded derefs
of pte_t* from being introduced?

I suppose we could convert pte_t to a single-member struct to force a
compile error. That struct will get passed by value to ptep_get() so
that's OK. But this isn't viable unless/until all architectures are
converted :(

Or we rely upon Ryan to grep the tree occasionally ;)
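
For reference, one possible reading of the single-member struct idea, as a userspace sketch. All `model_` / `MODEL_` names and the bit layout are hypothetical; this only shows how the C type system behaves, not how any architecture defines pte_t.

```c
#include <assert.h>

/* Hypothetical model: pte_t wrapped in a single-member struct. */
typedef struct { unsigned long pte; } pte_t;

/* Model of a ptep_get()-style accessor; returns the struct by value. */
static inline pte_t model_ptep_get(const pte_t *ptep)
{
	pte_t pte = { *(const volatile unsigned long *)&ptep->pte };
	return pte;
}

/* Model of a pte_val()-style unwrapper. */
static inline unsigned long model_pte_val(pte_t pte)
{
	return pte.pte;
}

#define MODEL_PTE_PRESENT 0x1UL	/* hypothetical bit layout */

static inline int model_pte_present(const pte_t *ptep)
{
	/*
	 * Rejected by the compiler, because *ptep is a struct and not
	 * an integer:
	 *
	 *	return (*ptep & MODEL_PTE_PRESENT) != 0;
	 */
	return (model_pte_val(model_ptep_get(ptep)) & MODEL_PTE_PRESENT) != 0;
}
```

Note that a plain struct copy such as `pte_t pte = *ptep;` still compiles in this model, so the wrapper on its own only catches integer-style uses of the dereferenced value.
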