[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ce71f42f-e80d-4bae-9b8d-d09fe8bd1527@lucifer.local>
Date: Tue, 28 Oct 2025 18:20:54 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Janosch Frank <frankja@...ux.ibm.com>,
Claudio Imbrenda <imbrenda@...ux.ibm.com>,
David Hildenbrand <david@...hat.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>, Zi Yan <ziy@...dia.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
Dev Jain <dev.jain@....com>, Barry Song <baohua@...nel.org>,
Lance Yang <lance.yang@...ux.dev>,
Kemeng Shi <shikemeng@...weicloud.com>,
Kairui Song <kasong@...cent.com>, Nhat Pham <nphamcs@...il.com>,
Baoquan He <bhe@...hat.com>, Chris Li <chrisl@...nel.org>,
Peter Xu <peterx@...hat.com>, Matthew Wilcox <willy@...radead.org>,
Leon Romanovsky <leon@...nel.org>, Muchun Song <muchun.song@...ux.dev>,
Oscar Salvador <osalvador@...e.de>, Vlastimil Babka <vbabka@...e.cz>,
Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Jann Horn <jannh@...gle.com>, Matthew Brost <matthew.brost@...el.com>,
Joshua Hahn <joshua.hahnjy@...il.com>, Rakie Kim <rakie.kim@...com>,
Byungchul Park <byungchul@...com>, Gregory Price <gourry@...rry.net>,
Ying Huang <ying.huang@...ux.alibaba.com>,
Alistair Popple <apopple@...dia.com>, Pedro Falcato <pfalcato@...e.de>,
Pasha Tatashin <pasha.tatashin@...een.com>,
Rik van Riel <riel@...riel.com>, Harry Yoo <harry.yoo@...cle.com>,
kvm@...r.kernel.org, linux-s390@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [RFC PATCH 00/12] remove is_swap_[pte, pmd]() + non-swap
confusion
On Tue, Oct 28, 2025 at 09:48:17AM -0300, Jason Gunthorpe wrote:
> On Mon, Oct 27, 2025 at 05:33:57PM +0000, Lorenzo Stoakes wrote:
> > (Note I never intended this to be an RFC, it was only because of
> > series-likely-to-be-dropped causing nasty conflicts this isn't an 'out
> > there' series rather a practical submission).
> >
> > To preface, as I said elsewhere, I intend to do more on this, renaming
> > swp_entry_t to probably leaf_entry_t (thanks Gregory!)
> >
> > The issue is no matter how I do this people will theorise different
> > approaches, I'm trying to practically find a way forward that works
> > iteratively.
>
> It is why I suggested that swp_entry_t is the name we have (for this
> series at least) and lean into it as the proper name for the abstract
> idea of a multi-type'd value. Having a following series to rename
> "swp_entry_t" to some "leaf entry" will resolve the poor naming.
This is addressed below.
> But for now, "swp_entry_t" does not mean *swap* entry, it means "leaf
> entry with a really bad type name".
Yes.
>
> And swpent_* is the namespace prefix for things dealing with
> swp_entry_t.
>
> If done consistently then the switch to leaf entry naming is just a
> simple mass rename of swpent/leafent.
>
> > > That suggests functions like this:
> > >
> > > swpent_is_swap()
> > > swpent_is_migration()
> > > ..
> >
> > The _whole point_ of this series is to separate out the idea that you're
> > dealing with swap entries so I don't like swpent as a name obviously.
>
> As you say we can't fix everything at once, but if you do the above
> and then rename the end state would be
>
> leafent_is_swap()
> leafent_is_migration()
> ..
>
> And that seems like a good end state.
This is a two wrongs don't make a right situation.
I don't want to belabour this because we ultimately agree using
leafent_xxx() now is fine.
>
> So pick the small steps, either lean into swpent in this series as the
> place holder for leafent in the next..
>
> Or this seems like a good idea too:
>
> > We could also just pre-empt and prefix functions with leafent_is_swap() if
> > you prefer.
Good. I may even go so far as to say 'thank science we agree on that' ;)
Yes I'll do this.
> >
> > We could even do:
> >
> > /* TODO: Rename swap_entry_t to leaf_entry_t */
> > typedef swap_entry_t leaf_entry_t;
BTW typo, obv. meant swp_entry_t here...
> >
> > And use the new type right away.
>
> Then the followup series is cleaning away swap_entry_t as a name.
OK so you're good with the typedef? This would be quite nice actually as we
could then use leaf_entry_t in all the core leafent_xxx() logic ahead of
time and reduce confusion _there_ and effectively document that swp_entry_t
is just badly named.
This follow up series is one I very much intend to do, it's just going to
be a big churny one (hey my speciality anyway) but one which is best done
entirely mechanically I think.
>
> > > /* True if the pte is a swpent_is_swap() */
> > > static inline bool swpent_get_swap_pte(pte_t pte, swp_entry_t *entryp)
> > > {
> > > if (pte_present(pte))
> > > return false;
> > > *swpent = pte_to_swp_entry(pte);
> > > return swpent_is_swap(*swpent);
> > > }
> >
> > I already implement in the series a pte_to_swp_entry_or_zero() function
>
> I saw, but I don't think it is a great name.. It doesn't really give
> "zero" it gives a swp_entry_t that doesn't pass any of the
> swpent_is_XX() functions. ie a none type.
Naming is hard...
I mean really it wouldn't be all too awful to have pte_to_leafent() do this
now...
>
> > that goes one further - checks pte_present() for you, if pte_none() you
> > just get an empty swap entry, so this can be:
>
> And I was hoping to see a path to get rid of the pte_none() stuff, or
> at least on most arches. It is pretty pointless to check for pte_none
> if the arch has a none-pte that already is 0..
>
> So pte_none can be more like:
> swpent_is_none(pte_to_swp_entry(pte))
>
> Where pte_to_swp_entry is just some bit maths with no conditionals.
*leafent
I mean I'm not so sure that's all that useful, you often want to skip over
things that are 'none' entries without doing this conversion.
We could use the concept of 'none is an empty leaf_entry_t' more thoroughly
internally in functions though.
I will see what I can do.
>
> > > I also think it will be more readable to keep all these things under a
> > > swpent namespace instead of using unstructured english names.
> >
> > Nope. Again, the whole point of the series is to avoid referencing
> > swap. swpent_xxx() is just eliminating the purpose of the series right?
> >
> > Yes it sucks that the type name is what it is, but this is an iterative
> > process.
>
> Sure, but don't add a bunch of new names with *no namespace*. As above
> either accept swpent is a placeholder for leafent in the next series,
> or do this:
>
> > But as above, we could pre-empt future changes and prefix with a
> > leafent_*() prefix if that works for you?
>
> Which seems like a good idea to me.
Yup. We agree on this.
>
> > > I'd expect a safe function should be more like
> > >
> > > *swpent = pte_to_swp_entry_safe(pte);
> > > return swpent_is_swap(*swpent);
> > >
> > > Where "safe" means that if the PTE is None or Present then
> > > swpent_is_XX() == false. Ie it returns a 0 swpent and 0 swpent is
> > > always nothing.
> >
> > Not sure it's really 'safe', the name is unfortunate, but you could read
> > this as 'always get a valid swap entry to operate on'...
>
> My suggestion was the leaf entry has a type {none, swap, migration, etc}
>
> And this _safe version returns the none type'd leaf entry for a
> present pte.
I mean that's already what's happening more or less with the ..._is_zero()
function (albeit needing a rename).
>
> We move toward eliminating the idea of pte_none by saying a
> non-present pte is always a leaf_entry and what we call a "none pte"
> is a "none leaf entry"
Well as discussed above.
>
> > leaf_entry_t leafent_from_pte()...?
>
> Probably this one?
> > > static inline bool get_pte_swap_entry(pte_t pte, swp_entry_t *entryp)
> > > {
> > > return swpent_is_swap(*swpent = pte_to_swp_entry_safe(pte));
> > > }
> >
> > I absolutely hate that embedded assignment, but this is equivalent to what
> > I suggested above, so agreed this is a good suggestion broadly.
> >
> > >
> > > Maybe it doesn't even need an inline at that point?
> >
> > Don't understand what you mean by that. It's in a header file?
>
> I mean just write it like this in the callers:
>
> swp_entry_t leafent = pte_to_swp_entry_safe(pte);
>
> if (swpent_is_swap(leafent)) {
> }
>
> It is basically the same # lines as the helper version.
Right, good point!
>
> > > > * is_huge_pmd() - Determines if a PMD contains either a present transparent
> > > > huge page entry or a huge non-present entry. This again simplifies a lot
> > > > of logic that simply open-coded this.
> > >
> > > is_huge_or_swpent_pmd() would be nicer, IMHO. I think it is surprising
> > > when any of these APIs accept swap entries without being explicit
> >
> > Again, I'm not going to reference swap in a series intended to eliminate
> > this, it defeats the purpose.
> >
> > And the non-present (or whatever you want to call it) entry _is_ huge. So
> > it's just adding more confusion that way IMO.
>
> Then this:
>
> pmd_is_present_or_leafent(pmd)
A PMD can be present and contain an entry pointing at a PTE table so I'm
not sure that helps... naming is hard :)
Will think of alternatives on respin.
>
> Jason
Thanks, Lorenzo
Powered by blists - more mailing lists