[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251028124817.GH760669@ziepe.ca>
Date: Tue, 28 Oct 2025 09:48:17 -0300
From: Jason Gunthorpe <jgg@...pe.ca>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Janosch Frank <frankja@...ux.ibm.com>,
Claudio Imbrenda <imbrenda@...ux.ibm.com>,
David Hildenbrand <david@...hat.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>,
Vasily Gorbik <gor@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>, Zi Yan <ziy@...dia.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
Dev Jain <dev.jain@....com>, Barry Song <baohua@...nel.org>,
Lance Yang <lance.yang@...ux.dev>,
Kemeng Shi <shikemeng@...weicloud.com>,
Kairui Song <kasong@...cent.com>, Nhat Pham <nphamcs@...il.com>,
Baoquan He <bhe@...hat.com>, Chris Li <chrisl@...nel.org>,
Peter Xu <peterx@...hat.com>, Matthew Wilcox <willy@...radead.org>,
Leon Romanovsky <leon@...nel.org>,
Muchun Song <muchun.song@...ux.dev>,
Oscar Salvador <osalvador@...e.de>,
Vlastimil Babka <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>,
Michal Hocko <mhocko@...e.com>, Jann Horn <jannh@...gle.com>,
Matthew Brost <matthew.brost@...el.com>,
Joshua Hahn <joshua.hahnjy@...il.com>, Rakie Kim <rakie.kim@...com>,
Byungchul Park <byungchul@...com>,
Gregory Price <gourry@...rry.net>,
Ying Huang <ying.huang@...ux.alibaba.com>,
Alistair Popple <apopple@...dia.com>,
Pedro Falcato <pfalcato@...e.de>,
Pasha Tatashin <pasha.tatashin@...een.com>,
Rik van Riel <riel@...riel.com>, Harry Yoo <harry.yoo@...cle.com>,
kvm@...r.kernel.org, linux-s390@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [RFC PATCH 00/12] remove is_swap_[pte, pmd]() + non-swap
confusion
On Mon, Oct 27, 2025 at 05:33:57PM +0000, Lorenzo Stoakes wrote:
> (Note I never intended this to be an RFC, it was only because of
> series-likely-to-be-dropped causing nasty conflicts this isn't an 'out
> there' series rather a practical submission).
>
> To preface, as I said elsewhere, I intend to do more on this, renaming
> swp_entry_t to probably leaf_entry_t (thanks Gregory!)
>
> The issue is no matter how I do this people will theorise different
> approaches, I'm trying to practically find a way forward that works
> iteratively.
It is why I suggested that swp_entry_t is the name we have (for this
series at least) and lean into it as the proper name for the abstract
idea of a multi-type'd value. Having a following series to rename
"swp_entry_t" to some "leaf entry" will resolve the poor naming.
But for now, "swp_entry_t" does not mean *swap* entry, it means "leaf
entry with a really bad type name".
And swpent_* is the namespace prefix for things dealing with
swp_entry_t.
If done consistently then the switch to leaf entry naming is just a
simple mass rename of swpent/leafent.
> > That suggests functions like this:
> >
> > swpent_is_swap()
> > swpent_is_migration()
> > ..
>
> The _whole point_ of this series is to separate out the idea that you're
> dealing with swap entries so I don't like swpent as a name obviously.
As you say we can't fix everything at once, but if you do the above
and then rename the end state would be
leafent_is_swap()
leafent_is_migration()
..
And that seems like a good end state.
So pick the small steps, either lean into swpent in this series as the
place holder for leafent in the next..
Or this seems like a good idea too:
> We could also just pre-empt and prefix functions with leafent_is_swap() if
> you prefer.
>
> We could even do:
>
> /* TODO: Rename swap_entry_t to leaf_entry_t */
> typedef swap_entry_t leaf_entry_t;
>
> And use the new type right away.
Then the followup series is cleaning away swap_entry_t as a name.
> > /* True if the pte is a swpent_is_swap() */
> > static inline bool swpent_get_swap_pte(pte_t pte, swp_entry_t *entryp)
> > {
> > if (pte_present(pte))
> > return false;
> > *swpent = pte_to_swp_entry(pte);
> > return swpent_is_swap(*swpent);
> > }
>
> I already implement in the series a pte_to_swp_entry_or_zero() function
I saw, but I don't think it is a great name.. It doesn't really give
"zero" it gives a swp_entry_t that doesn't pass any of the
swpent_is_XX() functions. ie a none type.
> that goes one further - checks pte_present() for you, if pte_none() you
> just get an empty swap entry, so this can be:
And I was hoping to see a path to get rid of the pte_none() stuff, or
at least on most arches. It is pretty pointless to check for pte_none
if the arch has a none-pte that already is 0..
So pte_none can be more like:
swpent_is_none(pte_to_swp_entry(pte))
Where pte_to_swp_entry is just some bit maths with no conditionals.
> > I also think it will be more readable to keep all these things under a
> > swpent namespace instead of using unstructured english names.
>
> Nope. Again, the whole point of the series is to avoid referencing
> swap. swpent_xxx() is just eliminating the purpose of the series right?
>
> Yes it sucks that the type name is what it is, but this is an iterative
> process.
Sure, but don't add a bunch of new names with *no namespace*. As above
either accept swpent is a placeholder for leafent in the next series,
or do this:
> But as above, we could pre-empt future changes and prefix with a
> leafent_*() prefix if that works for you?
Which seems like a good idea to me.
> > I'd expect a safe function should be more like
> >
> > *swpent = pte_to_swp_entry_safe(pte);
> > return swpent_is_swap(*swpent);
> >
> > Where "safe" means that if the PTE is None or Present then
> > swpent_is_XX() == false. Ie it returns a 0 swpent and 0 swpent is
> > always nothing.
>
> Not sure it's really 'safe', the name is unfortunate, but you could read
> this as 'always get a valid swap entry to operate on'...
My suggestion was the leaf entry has a type {none, swap, migration, etc}
And this _safe version returns the none type'd leaf entry for a
present pte.
We move toward eliminating the idea of pte_none by saying a
non-present pte is always a leaf_entry and what we call a "none pte"
is a "none leaf entry"
> leaf_entry_t leafent_from_pte()...?
Probably this one?
> > static inline bool get_pte_swap_entry(pte_t pte, swp_entry_t *entryp)
> > {
> > return swpent_is_swap(*swpent = pte_to_swp_entry_safe(pte));
> > }
>
> I absolutely hate that embedded assignment, but this is equivalent to what
> I suggested above, so agreed this is a good suggestion broadly.
>
> >
> > Maybe it doesn't even need an inline at that point?
>
> Don't understand what you mean by that. It's in a header file?
I mean just write it like this in the callers:
swp_entry_t leafent = pte_to_swp_entry_safe(pte);
if (swpent_is_swap(leafent)) {
}
It is basically the same # lines as the helper version.
> > > * is_huge_pmd() - Determines if a PMD contains either a present transparent
> > > huge page entry or a huge non-present entry. This again simplifies a lot
> > > of logic that simply open-coded this.
> >
> > is_huge_or_swpent_pmd() would be nicer, IMHO. I think it is surprising
> > when any of these APIs accept swap entries without being explicit
>
> Again, I'm not going to reference swap in a series intended to eliminate
> this, it defeats the purpose.
>
> And the non-present (or whatever you want to call it) entry _is_ huge. So
> it's just adding more confusion that way IMO.
Then this:
pmd_is_present_or_leafent(pmd)
Jason
Powered by blists - more mailing lists