Message-ID: <aGHNmKRng9H6kTqz@hyeyoo>
Date: Mon, 30 Jun 2025 08:34:48 +0900
From: Harry Yoo <harry.yoo@...cle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Byungchul Park <byungchul@...com>, willy@...radead.org,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, kernel_team@...ynix.com, almasrymina@...gle.com,
        ilias.apalodimas@...aro.org, hawk@...nel.org,
        akpm@...ux-foundation.org, davem@...emloft.net,
        john.fastabend@...il.com, andrew+netdev@...n.ch,
        asml.silence@...il.com, toke@...hat.com, tariqt@...dia.com,
        edumazet@...gle.com, pabeni@...hat.com, saeedm@...dia.com,
        leon@...nel.org, ast@...nel.org, daniel@...earbox.net,
        david@...hat.com, lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com,
        vbabka@...e.cz, rppt@...nel.org, surenb@...gle.com, mhocko@...e.com,
        horms@...nel.org, linux-rdma@...r.kernel.org, bpf@...r.kernel.org,
        vishal.moola@...il.com, hannes@...xchg.org, ziy@...dia.com,
        jackmanb@...gle.com
Subject: Re: [PATCH net-next v7 1/7] netmem: introduce struct netmem_desc
 mirroring struct page

On Fri, Jun 27, 2025 at 05:37:30PM -0700, Jakub Kicinski wrote:
> On Fri, 27 Jun 2025 12:54:05 +0900 Byungchul Park wrote:
> > On Thu, Jun 26, 2025 at 05:49:04PM -0700, Jakub Kicinski wrote:
> > > On Wed, 25 Jun 2025 13:33:44 +0900 Byungchul Park wrote:  
> > > > +/* A memory descriptor representing abstract networking I/O vectors,
> > > > + * generally for non-pages memory that doesn't have its corresponding
> > > > + * struct page and needs to be explicitly allocated through slab.  
> > > 
> > > I still don't get what your final object set is going to be.  
> > 
> > The ultimate goal is:
> > 
> >    Remove the pp fields from struct page
> > 
> > The second important goal is:
> > 
> >    Introduce a network pp descriptor, netmem_desc
> > 
> > While working on these two goals, I added some extra patches too, to
> > clean up related code if it's obvious e.g. patches for renaming and so
> > on.
> 
> Object set. Not objective.
> 
> > > We have
> > >  - CPU-readable buffers (struct page)
> > >  - un-readable buffers (struct net_iov)
> > >  - abstract reference which can be a pointer to either of the
> > >    above two (bitwise netmem_ref)
> > > 
> > > You say you want to evacuate page pool state from struct page
> > > so I'd expect you to add a type which can always be fed into
> > > some form of $type_to_virt(). A type which can always be cast
> > > to net_iov, but not vice versa. So why are you putting things
> > > inside net_iov, not outside.  
> > 
> > The type, struct netmem_desc, is declared outside.  Even though it's
> > used overlaying on struct page *for now*, it will be dynamically
> > allocated through slab shortly - it's also one of mm's plan.
> > 
> > As you know, net_iov is working with the assumption that it overlays on
> > struct page *for now* indeed, when it comes to netmem_ref.  See the
> > following APIs as example:
> > 
> > static inline struct net_iov *__netmem_clear_lsb(netmem_ref netmem)
> > {
> > 	return (struct net_iov *)((__force unsigned long)netmem & ~NET_IOV);
> > }
> > 
> > static inline void netmem_set_pp(netmem_ref netmem, struct page_pool *pool)
> > {
> > 	__netmem_clear_lsb(netmem)->pp = pool;
> > }
> > 
> > I'd say, I replaced the overlaying (on struct page) part with a
> > well-defined struct, netmem_desc that will play the role of struct page
> > for pp usage, instead of a set of the current overlaying fields of
> > net_iov.
> > 
> > This 'introduction of netmem_desc' patch can be the base for network
> > code to use netmem_desc as pp descriptor instead of struct page.  That's
> > what I meant.
> > 
> > Am I missing something or got you wrong?  If yes, please explain in more
> > detail then I will get back with the answer.
> 
> Ugh, you keep explaining the mechanics to me. Our goal here is not
> just to move fields around and make it still compile :/
> 
> Let me ask you this way: you said "netmem_desc" will be allocated
> thru slab "shortly". How will calling the equivalent of page_address()
> on netmem_desc work at that stage? Feel free to refer me to the existing
> docs if its covered..

https://kernelnewbies.org/MatthewWilcox/Memdescs/Path
https://kernelnewbies.org/MatthewWilcox/Memdescs

These may not be the exact documents you're looking for,
but based on them, here is how I picture it:

- The ultimate goal is to eventually shrink struct page from 64 bytes
  to 8 bytes, by statically allocating only the minimum required
  metadata per 4k page and moving the rest of the metadata to
  dynamically allocated descriptors (netmem_desc, anon, file, ptdesc,
  zpdesc, etc.) that are allocated from slab at page allocation time.

- We can't achieve that goal just yet, because several subsystems
  still use struct page fields for their own purposes.

  To get there, each of these subsystems needs to define its own
  descriptor, which, for now, overlays struct page, and needs to be
  converted to use that descriptor instead of struct page directly.

  Eventually, these descriptors will be allocated using slab.

- For CPU-readable buffers, page->memdesc will point to a netmem_desc,
  with a lower bit set indicating that it's a netmem_desc rather than
  another type. Networking code will need to cast it to (netmem_desc *)
  and dereference it to access networking-specific fields.

- The struct page array (vmemmap) will still be statically allocated
  at boot time (or at memory hotplug time), so there is no change in
  how page_address() works.
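
To make the page->memdesc point above a bit more concrete, here is a
minimal user-space sketch of a tagged descriptor pointer. It is only
an illustration, not kernel code: the struct layouts, the tag values,
the 4-bit mask and the helper names page_set_netmem_desc() and
page_netmem_desc() are all assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags stored in the low bits of page->memdesc. */
enum memdesc_type {
	MEMDESC_NONE      = 0,
	MEMDESC_NETMEM    = 1,
	MEMDESC_TYPE_MASK = 0xf,
};

/* Stand-in for the shrunken, 8-byte struct page (illustrative only). */
struct page {
	uintptr_t memdesc;	/* descriptor pointer | type tag */
};

/*
 * Stand-in for a dynamically allocated descriptor. The 16-byte alignment
 * keeps the low four bits of its address free for the type tag.
 */
struct netmem_desc {
	_Alignas(16) void *pp;	/* owning page pool (opaque here) */
	unsigned long dma_addr;
};

/* Attach a descriptor to a page, tagging the pointer with its type. */
static void page_set_netmem_desc(struct page *page, struct netmem_desc *desc)
{
	assert(((uintptr_t)desc & MEMDESC_TYPE_MASK) == 0);
	page->memdesc = (uintptr_t)desc | MEMDESC_NETMEM;
}

/* Return the descriptor, or NULL if the page holds another type. */
static struct netmem_desc *page_netmem_desc(const struct page *page)
{
	if ((page->memdesc & MEMDESC_TYPE_MASK) != MEMDESC_NETMEM)
		return NULL;
	return (struct netmem_desc *)
		(page->memdesc & ~(uintptr_t)MEMDESC_TYPE_MASK);
}
```

In this model the struct page itself stays where it is, so anything
that only needs the page's position in the array is unaffected; only
access to the networking fields goes through the extra dereference.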

net_iovs will continue not to be associated with struct pages,
as their buffers don't have corresponding struct pages.
net_iovs are already allocated using slab.
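
For the net_iov side, the low-bit convention behind the
__netmem_clear_lsb() helper quoted earlier in this thread can be
modeled in user space roughly as below. This is a sketch under
assumptions: every mock_* name and both struct layouts are invented
for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Low bit set: the reference points at a net_iov, not a page. */
#define MOCK_NET_IOV ((uintptr_t)1)

/* Illustrative stand-ins; 8-byte alignment keeps the low bit free. */
struct mock_page    { _Alignas(8) unsigned long flags; };
struct mock_net_iov { _Alignas(8) void *owner; };

/* Abstract reference to either kind of buffer. */
typedef uintptr_t mock_netmem_ref;

static mock_netmem_ref mock_page_to_netmem(struct mock_page *page)
{
	return (uintptr_t)page;
}

static mock_netmem_ref mock_net_iov_to_netmem(struct mock_net_iov *niov)
{
	return (uintptr_t)niov | MOCK_NET_IOV;
}

static bool mock_netmem_is_net_iov(mock_netmem_ref ref)
{
	return (ref & MOCK_NET_IOV) != 0;
}

/* Only page-backed references may be turned back into a page. */
static struct mock_page *mock_netmem_to_page(mock_netmem_ref ref)
{
	if (mock_netmem_is_net_iov(ref))
		return NULL;
	return (struct mock_page *)ref;
}

/* Strip the tag bit to recover the net_iov pointer. */
static struct mock_net_iov *mock_netmem_to_net_iov(mock_netmem_ref ref)
{
	if (!mock_netmem_is_net_iov(ref))
		return NULL;
	return (struct mock_net_iov *)(ref & ~MOCK_NET_IOV);
}
```

The point of the model is that a net_iov reference never resolves to
a struct page, which matches the statement above that net_iovs have
no corresponding struct pages.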

HTH

-- 
Cheers,
Harry / Hyeonggon
