Message-ID: <c69f57ff-c4b1-4fb9-8954-c5687dc2d904@lucifer.local>
Date: Wed, 12 Nov 2025 15:59:39 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Zi Yan <ziy@...dia.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Janosch Frank <frankja@...ux.ibm.com>,
Claudio Imbrenda <imbrenda@...ux.ibm.com>,
David Hildenbrand <david@...hat.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>, Peter Xu <peterx@...hat.com>,
Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
Arnd Bergmann <arnd@...db.de>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
Dev Jain <dev.jain@....com>, Barry Song <baohua@...nel.org>,
Lance Yang <lance.yang@...ux.dev>, Muchun Song <muchun.song@...ux.dev>,
Oscar Salvador <osalvador@...e.de>, Vlastimil Babka <vbabka@...e.cz>,
Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Matthew Brost <matthew.brost@...el.com>,
Joshua Hahn <joshua.hahnjy@...il.com>, Rakie Kim <rakie.kim@...com>,
Byungchul Park <byungchul@...com>, Gregory Price <gourry@...rry.net>,
Ying Huang <ying.huang@...ux.alibaba.com>,
Alistair Popple <apopple@...dia.com>,
Axel Rasmussen <axelrasmussen@...gle.com>,
Yuanchu Xie <yuanchu@...gle.com>, Wei Xu <weixugc@...gle.com>,
Kemeng Shi <shikemeng@...weicloud.com>,
Kairui Song <kasong@...cent.com>, Nhat Pham <nphamcs@...il.com>,
Baoquan He <bhe@...hat.com>, Chris Li <chrisl@...nel.org>,
SeongJae Park <sj@...nel.org>, Matthew Wilcox <willy@...radead.org>,
Jason Gunthorpe <jgg@...pe.ca>, Leon Romanovsky <leon@...nel.org>,
Xu Xin <xu.xin16@....com.cn>,
Chengming Zhou <chengming.zhou@...ux.dev>,
Jann Horn <jannh@...gle.com>, Miaohe Lin <linmiaohe@...wei.com>,
Naoya Horiguchi <nao.horiguchi@...il.com>,
Pedro Falcato <pfalcato@...e.de>,
Pasha Tatashin <pasha.tatashin@...een.com>,
Rik van Riel <riel@...riel.com>, Harry Yoo <harry.yoo@...cle.com>,
Hugh Dickins <hughd@...gle.com>, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org, linux-s390@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-arch@...r.kernel.org, damon@...ts.linux.dev
Subject: Re: [PATCH v3 03/16] mm: avoid unnecessary uses of is_swap_pte()
On Tue, Nov 11, 2025 at 09:58:36PM -0500, Zi Yan wrote:
> On 10 Nov 2025, at 17:21, Lorenzo Stoakes wrote:
>
> > There's an established convention in the kernel that we treat PTEs as
> > containing swap entries (and the unfortunately named non-swap swap entries)
> > should they be neither empty (i.e. pte_none() evaluating true) nor present
> > (i.e. pte_present() evaluating true).
> >
> > However, there is some inconsistency in how this is applied, as we also
> > have the is_swap_pte() helper which explicitly performs this check:
> >
> > /* check whether a pte points to a swap entry */
> > static inline int is_swap_pte(pte_t pte)
> > {
> > return !pte_none(pte) && !pte_present(pte);
> > }
> >
> > As this represents a predicate, and it's logical to assume that in order to
> > establish that a PTE entry can correctly be manipulated as a swap/non-swap
> > entry, this predicate seems as if it must first be checked.
> >
> > But we instead, we far more often utilise the established convention of
> > checking pte_none() / pte_present() before operating on entries as if they
> > were swap/non-swap.
> >
> > This patch works towards correcting this inconsistency by removing all uses
> > of is_swap_pte() where we are already in a position where we perform
> > pte_none()/pte_present() checks anyway or otherwise it is clearly logical
> > to do so.
> >
> > We also take advantage of the fact that pte_swp_uffd_wp() is only set on
> > swap entries.
> >
> > Additionally, update comments referencing to is_swap_pte() and
> > non_swap_entry().
> >
> > No functional change intended.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> > ---
> > fs/proc/task_mmu.c | 49 ++++++++++++++++++++++++-----------
> > include/linux/userfaultfd_k.h | 3 +--
> > mm/hugetlb.c | 6 ++---
> > mm/internal.h | 6 ++---
> > mm/khugepaged.c | 29 +++++++++++----------
> > mm/migrate.c | 2 +-
> > mm/mprotect.c | 43 ++++++++++++++----------------
> > mm/mremap.c | 7 +++--
> > mm/page_table_check.c | 13 ++++++----
> > mm/page_vma_mapped.c | 31 +++++++++++-----------
> > 10 files changed, 104 insertions(+), 85 deletions(-)
> >
>
> <snip>
>
> > diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> > index be20468fb5a9..a4e23818f37f 100644
> > --- a/mm/page_vma_mapped.c
> > +++ b/mm/page_vma_mapped.c
> > @@ -16,6 +16,7 @@ static inline bool not_found(struct page_vma_mapped_walk *pvmw)
> > static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
> > spinlock_t **ptlp)
> > {
> > + bool is_migration;
> > pte_t ptent;
> >
> > if (pvmw->flags & PVMW_SYNC) {
> > @@ -26,6 +27,7 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
> > return !!pvmw->pte;
> > }
> >
> > + is_migration = pvmw->flags & PVMW_MIGRATION;
> > again:
> > /*
> > * It is important to return the ptl corresponding to pte,
> > @@ -41,11 +43,14 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
> >
> > ptent = ptep_get(pvmw->pte);
> >
> > - if (pvmw->flags & PVMW_MIGRATION) {
> > - if (!is_swap_pte(ptent))
>
> Here, is_migration = true and either pte_none() or pte_present()
> would return false, and ...
>
> > + if (pte_none(ptent)) {
> > + return false;
> > + } else if (pte_present(ptent)) {
> > + if (is_migration)
> > return false;
> > - } else if (is_swap_pte(ptent)) {
> > + } else if (!is_migration) {
> > swp_entry_t entry;
> > +
> > /*
> > * Handle un-addressable ZONE_DEVICE memory.
> > *
> > @@ -66,8 +71,6 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
> > if (!is_device_private_entry(entry) &&
> > !is_device_exclusive_entry(entry))
> > return false;
> > - } else if (!pte_present(ptent)) {
> > - return false;
>
> ... is_migration = false and !pte_present() is actually pte_none(),
> because of the is_swap_pte() above the added !is_migration check.
> So pte_none() should return false regardless of is_migration.
I guess you were working this through :) Well, I decided to do the same, just
to double-check I got it right. Maybe useful for you also :P
Previously:
	if (is_migration) {
		if (!is_swap_pte(ptent))
			return false;
	} else if (is_swap_pte(ptent)) {
		... ZONE_DEVICE blah ...
	} else if (!pte_present(ptent)) {
		return false;
	}
But is_swap_pte() is the same as !pte_none() && !pte_present(), so
!is_swap_pte() is pte_none() || pte_present() by De Morgan's law:
	if (is_migration) {
		if (pte_none(ptent) || pte_present(ptent))
			return false;
	} else if (!pte_none(ptent) && !pte_present(ptent)) {
		... ZONE_DEVICE blah ...
	} else if (!pte_present(ptent)) {
		return false;
	}
In the last branch, we know (again by De Morgan's law) that either
pte_none(ptent) or pte_present(ptent) holds. But we explicitly check for
!pte_present(ptent), so this becomes:
	if (is_migration) {
		if (pte_none(ptent) || pte_present(ptent))
			return false;
	} else if (!pte_none(ptent) && !pte_present(ptent)) {
		... ZONE_DEVICE blah ...
	} else if (pte_none(ptent)) {
		return false;
	}
So we can generalise - regardless of is_migration, a pte_none() entry results
in returning false:
	if (pte_none(ptent)) {
		return false;
	} else if (is_migration) {
		if (pte_none(ptent) || pte_present(ptent))
			return false;
	} else if (!pte_none(ptent) && !pte_present(ptent)) {
		... ZONE_DEVICE blah ...
	}
Since we already check for pte_none() ahead of time, we can simplify again:
	if (pte_none(ptent)) {
		return false;
	} else if (is_migration) {
		if (pte_present(ptent))
			return false;
	} else if (!pte_present(ptent)) {
		... ZONE_DEVICE blah ...
	}
We can then put the pte_present() check in the outer branch:
	if (pte_none(ptent)) {
		return false;
	} else if (pte_present(ptent)) {
		if (is_migration)
			return false;
	} else if (!is_migration) {
		... ZONE_DEVICE blah ...
	}
This works because previously an is_migration && !pte_present() case (with
pte_none() already excluded) would result in no action here.
Which is the code in this patch :)
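
For anyone who would rather check the transformation mechanically than by eye, here is a minimal userspace sketch that compares the old and new branch structure over every input combination. It is only a model, not kernel code: the three-state enum pte_state, the enum outcome values and the helper names old_logic(), new_logic(), count_mismatches() and verify_equivalence() are all made up for illustration. It assumes a PTE is exactly one of none, present, or swap-like (that is, pte_none() and pte_present() are never both true).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a PTE is exactly one of these three states. */
enum pte_state { PTE_NONE, PTE_PRESENT, PTE_SWAP };

/* What the branch structure in map_pte() does with the entry. */
enum outcome { RETURN_FALSE, DEVICE_CHECK, FALL_THROUGH };

static bool pte_none(enum pte_state s) { return s == PTE_NONE; }
static bool pte_present(enum pte_state s) { return s == PTE_PRESENT; }

/* Same predicate as the helper quoted in the commit message. */
static bool is_swap_pte(enum pte_state s)
{
	return !pte_none(s) && !pte_present(s);
}

/* Branch structure before the patch. */
static enum outcome old_logic(bool is_migration, enum pte_state s)
{
	if (is_migration) {
		if (!is_swap_pte(s))
			return RETURN_FALSE;
	} else if (is_swap_pte(s)) {
		return DEVICE_CHECK;	/* ... ZONE_DEVICE blah ... */
	} else if (!pte_present(s)) {
		return RETURN_FALSE;
	}
	return FALL_THROUGH;
}

/* Branch structure after the patch. */
static enum outcome new_logic(bool is_migration, enum pte_state s)
{
	if (pte_none(s)) {
		return RETURN_FALSE;
	} else if (pte_present(s)) {
		if (is_migration)
			return RETURN_FALSE;
	} else if (!is_migration) {
		return DEVICE_CHECK;	/* ... ZONE_DEVICE blah ... */
	}
	return FALL_THROUGH;
}

/* Number of (is_migration, state) pairs on which the two versions differ. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (int m = 0; m <= 1; m++)
		for (int s = PTE_NONE; s <= PTE_SWAP; s++)
			if (old_logic(m, (enum pte_state)s) !=
			    new_logic(m, (enum pte_state)s))
				mismatches++;
	return mismatches;
}

/* Aborts if the model finds any disagreement between the two versions. */
static void verify_equivalence(void)
{
	assert(count_mismatches() == 0);
}
```

Calling verify_equivalence() from a small main() covers all six combinations. The model only captures which branch is taken, so it says nothing about what the ZONE_DEVICE check itself does.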
>
> This is a nice cleanup. Thanks.
>
> > }
> > spin_lock(*ptlp);
> > if (unlikely(!pmd_same(*pmdvalp, pmdp_get_lockless(pvmw->pmd)))) {
> > @@ -113,21 +116,17 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
> > return false;
> >
> > pfn = softleaf_to_pfn(entry);
> > - } else if (is_swap_pte(ptent)) {
> > - swp_entry_t entry;
> > + } else if (pte_present(ptent)) {
> > + pfn = pte_pfn(ptent);
> > + } else {
> > + const softleaf_t entry = softleaf_from_pte(ptent);
> >
> > /* Handle un-addressable ZONE_DEVICE memory */
> > - entry = pte_to_swp_entry(ptent);
> > - if (!is_device_private_entry(entry) &&
> > - !is_device_exclusive_entry(entry))
> > - return false;
> > -
> > - pfn = swp_offset_pfn(entry);
> > - } else {
> > - if (!pte_present(ptent))
>
> This !pte_present() is pte_none(). It seems that there should be
Well this should be fine though as:
	const softleaf_t entry = softleaf_from_pte(ptent);

	/* Handle un-addressable ZONE_DEVICE memory */
	if (!softleaf_is_device_private(entry) &&
	    !softleaf_is_device_exclusive(entry))
		return false;
Still correctly handles none, since softleaf_from_pte() on a pte_none() entry
yields a none softleaf entry, which fails both of these tests.
So excluding pte_none() as an explicit test here was part of the rework - we no
longer have to do that.
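
To illustrate why the explicit pte_none() test becomes redundant, here is a rough userspace model of that else-branch. Everything in it is a hypothetical stand-in rather than the kernel's definitions: enum model_leaf and the model_*() helpers just mimic the idea that a none entry is its own kind of leaf and is neither device-private nor device-exclusive.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for softleaf entry kinds; NOT kernel definitions. */
enum model_leaf {
	LEAF_NONE,		/* models what a pte_none() PTE decodes to */
	LEAF_SWAP,		/* models an ordinary swap entry */
	LEAF_DEVICE_PRIVATE,
	LEAF_DEVICE_EXCLUSIVE,
};

static bool model_is_device_private(enum model_leaf e)
{
	return e == LEAF_DEVICE_PRIVATE;
}

static bool model_is_device_exclusive(enum model_leaf e)
{
	return e == LEAF_DEVICE_EXCLUSIVE;
}

/*
 * Models the final else-branch of check_pte(): false means "return false",
 * true means "carry on and take the pfn from the entry".
 */
static bool model_non_present_branch_accepts(enum model_leaf entry)
{
	if (!model_is_device_private(entry) &&
	    !model_is_device_exclusive(entry))
		return false;
	return true;
}

/* A none entry is rejected by the same test that rejects plain swap. */
static void model_self_check(void)
{
	assert(!model_non_present_branch_accepts(LEAF_NONE));
	assert(!model_non_present_branch_accepts(LEAF_SWAP));
	assert(model_non_present_branch_accepts(LEAF_DEVICE_PRIVATE));
	assert(model_non_present_branch_accepts(LEAF_DEVICE_EXCLUSIVE));
}
```

In this model the none case needs no branch of its own, which is the same shape as the reworked code: the device-private/device-exclusive test already rejects it.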
>
> } else if (pte_none(ptent)) {
> return false;
> }
>
> before the above "} else {".
>
> > + if (!softleaf_is_device_private(entry) &&
> > + !softleaf_is_device_exclusive(entry))
> > return false;
> >
> > - pfn = pte_pfn(ptent);
> > + pfn = softleaf_to_pfn(entry);
> > }
> >
> > if ((pfn + pte_nr - 1) < pvmw->pfn)
> > --
> > 2.51.0
>
> Otherwise, LGTM. With the above issue addressed, feel free to
> add Reviewed-by: Zi Yan <ziy@...dia.com>
Thanks!
>
> --
> Best Regards,
> Yan, Zi
Cheers, Lorenzo