Message-ID: <aGHNmKRng9H6kTqz@hyeyoo>
Date: Mon, 30 Jun 2025 08:34:48 +0900
From: Harry Yoo <harry.yoo@...cle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Byungchul Park <byungchul@...com>, willy@...radead.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, kernel_team@...ynix.com, almasrymina@...gle.com,
ilias.apalodimas@...aro.org, hawk@...nel.org,
akpm@...ux-foundation.org, davem@...emloft.net,
john.fastabend@...il.com, andrew+netdev@...n.ch,
asml.silence@...il.com, toke@...hat.com, tariqt@...dia.com,
edumazet@...gle.com, pabeni@...hat.com, saeedm@...dia.com,
leon@...nel.org, ast@...nel.org, daniel@...earbox.net,
david@...hat.com, lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com,
vbabka@...e.cz, rppt@...nel.org, surenb@...gle.com, mhocko@...e.com,
horms@...nel.org, linux-rdma@...r.kernel.org, bpf@...r.kernel.org,
vishal.moola@...il.com, hannes@...xchg.org, ziy@...dia.com,
jackmanb@...gle.com
Subject: Re: [PATCH net-next v7 1/7] netmem: introduce struct netmem_desc
mirroring struct page
On Fri, Jun 27, 2025 at 05:37:30PM -0700, Jakub Kicinski wrote:
> On Fri, 27 Jun 2025 12:54:05 +0900 Byungchul Park wrote:
> > On Thu, Jun 26, 2025 at 05:49:04PM -0700, Jakub Kicinski wrote:
> > > On Wed, 25 Jun 2025 13:33:44 +0900 Byungchul Park wrote:
> > > > +/* A memory descriptor representing abstract networking I/O vectors,
> > > > + * generally for non-pages memory that doesn't have its corresponding
> > > > + * struct page and needs to be explicitly allocated through slab.
> > >
> > > I still don't get what your final object set is going to be.
> >
> > The ultimate goal is:
> >
> > Remove the pp fields from struct page
> >
> > The second important goal is:
> >
> > Introduce a network pp descriptor, netmem_desc
> >
> > While working on these two goals, I added some extra patches too, to
> > clean up related code if it's obvious e.g. patches for renaming and so
> > on.
>
> Object set. Not objective.
>
> > > We have
> > > - CPU-readable buffers (struct page)
> > > - un-readable buffers (struct net_iov)
> > > - abstract reference which can be a pointer to either of the
> > > above two (bitwise netmem_ref)
> > >
> > > You say you want to evacuate page pool state from struct page
> > > so I'd expect you to add a type which can always be fed into
> > > some form of $type_to_virt(). A type which can always be cast
> > > to net_iov, but not vice versa. So why are you putting things
> > > inside net_iov, not outside.
> >
> > The type, struct netmem_desc, is declared outside. Even though it's
> > used overlaying on struct page *for now*, it will be dynamically
> > allocated through slab shortly - it's also one of mm's plan.
> >
> > As you know, net_iov is working with the assumption that it overlays on
> > struct page *for now* indeed, when it comes to netmem_ref. See the
> > following APIs as example:
> >
> > static inline struct net_iov *__netmem_clear_lsb(netmem_ref netmem)
> > {
> > 	return (struct net_iov *)((__force unsigned long)netmem & ~NET_IOV);
> > }
> >
> > static inline void netmem_set_pp(netmem_ref netmem, struct page_pool *pool)
> > {
> > 	__netmem_clear_lsb(netmem)->pp = pool;
> > }
> >
> > I'd say, I replaced the overlaying (on struct page) part with a
> > well-defined struct, netmem_desc that will play the role of struct page
> > for pp usage, instead of a set of the current overlaying fields of
> > net_iov.
> >
> > This 'introduction of netmem_desc' patch can be the base for network
> > code to use netmem_desc as pp descriptor instead of struct page. That's
> > what I meant.
> >
> > Am I missing something or got you wrong? If yes, please explain in more
> > detail then I will get back with the answer.
>
> Ugh, you keep explaining the mechanics to me. Our goal here is not
> just to move fields around and make it still compile :/
>
> Let me ask you this way: you said "netmem_desc" will be allocated
> thru slab "shortly". How will calling the equivalent of page_address()
> on netmem_desc work at that stage? Feel free to refer me to the existing
> docs if its covered..

https://kernelnewbies.org/MatthewWilcox/Memdescs/Path
https://kernelnewbies.org/MatthewWilcox/Memdescs

These may not be the exact documents you're looking for,
but based on them, here is how I picture it:

- The ultimate goal is to eventually shrink struct page from 64 bytes
  to 8 bytes, by statically allocating only the minimum required
  metadata per 4k page and moving the rest of the metadata to
  dynamically allocated descriptors (netmem_desc, anon, file, ptdesc,
  zpdesc, etc.), which are allocated using slab at page allocation time.

- We can't achieve that goal just yet, because several subsystems
  still use struct page fields for their own purposes.
  To get there, each of these subsystems needs to define
  its own descriptor, which, for now, overlays struct page, and
  needs to be converted to use the new descriptor.
  Eventually, these descriptors will be allocated using slab.

- For CPU-readable buffers, page->memdesc will point to a netmem_desc,
  with a lower bit set indicating that it's a netmem_desc rather than
  another type. Networking code will need to cast it to (netmem_desc *)
  and dereference it to access networking-specific fields.

- The struct page array (vmemmap) will still be statically allocated
  at boot time (or at memory hotplug time).
  So there is no change in how page_address() works.

net_iovs will continue to have no associated struct pages,
as their buffers don't have corresponding struct pages.
net_iovs are already allocated using slab.
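
To make the last two points concrete, here is a rough userspace sketch of
how I imagine it, not actual kernel code. Every name ending in _sketch, the
MEMDESC_* values, and the 4-bit tag width are assumptions made up for
illustration. It only shows that the address of a page comes from its
position in the statically allocated array, while the networking fields are
reached through a tagged pointer stored in the page:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tag stored in the low bits of page->memdesc. */
#define MEMDESC_TYPE_MASK	((uintptr_t)0xf)
#define MEMDESC_NETMEM		((uintptr_t)0x1)

#define SKETCH_PAGE_SHIFT	12
#define SKETCH_NR_PAGES		4

/* Stand-in for the shrunken struct page: a single tagged word. */
struct page_sketch {
	uintptr_t memdesc;
};

/* Stand-in for a slab-allocated netmem_desc; aligned so the tag bits are free. */
struct netmem_desc_sketch {
	_Alignas(16) void *pp;		/* would be struct page_pool * */
	unsigned long dma_addr;
};

/* Statically allocated page array (plays the role of vmemmap) and its memory. */
static struct page_sketch vmemmap_sketch[SKETCH_NR_PAGES];
static unsigned char memory_sketch[SKETCH_NR_PAGES << SKETCH_PAGE_SHIFT];

/* The address depends only on the page's index in the array, not on memdesc. */
static inline void *page_address_sketch(const struct page_sketch *page)
{
	size_t pfn = (size_t)(page - vmemmap_sketch);

	return memory_sketch + (pfn << SKETCH_PAGE_SHIFT);
}

/* Attach a descriptor to a page, tagging the pointer with its type. */
static inline void page_set_netmem_desc(struct page_sketch *page,
					struct netmem_desc_sketch *desc)
{
	page->memdesc = (uintptr_t)desc | MEMDESC_NETMEM;
}

/* Return the descriptor, or NULL if the page holds some other type. */
static inline struct netmem_desc_sketch *
page_to_netmem_desc(const struct page_sketch *page)
{
	if ((page->memdesc & MEMDESC_TYPE_MASK) != MEMDESC_NETMEM)
		return NULL;

	return (struct netmem_desc_sketch *)(page->memdesc & ~MEMDESC_TYPE_MASK);
}
```

In this model, attaching or detaching a netmem_desc never changes what
page_address_sketch() returns for a page, which is the property being asked
about.
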
HTH
--
Cheers,
Harry / Hyeonggon