Message-ID: <CAHS8izMvRrG2wpE7HEyK3t544-wN_h3SC8nGabCoPWj1qCv_ag@mail.gmail.com>
Date: Tue, 27 May 2025 20:47:54 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Byungchul Park <byungchul@...com>
Cc: willy@...radead.org, netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
    linux-mm@...ck.org, kernel_team@...ynix.com, kuba@...nel.org,
    ilias.apalodimas@...aro.org, harry.yoo@...cle.com, hawk@...nel.org,
    akpm@...ux-foundation.org, davem@...emloft.net, john.fastabend@...il.com,
    andrew+netdev@...n.ch, asml.silence@...il.com, toke@...hat.com,
    tariqt@...dia.com, edumazet@...gle.com, pabeni@...hat.com, saeedm@...dia.com,
    leon@...nel.org, ast@...nel.org, daniel@...earbox.net, david@...hat.com,
    lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, vbabka@...e.cz,
    rppt@...nel.org, surenb@...gle.com, mhocko@...e.com, horms@...nel.org,
    linux-rdma@...r.kernel.org, bpf@...r.kernel.org, vishal.moola@...il.com
Subject: Re: [PATCH 01/18] netmem: introduce struct netmem_desc
    struct_group_tagged()'ed on struct net_iov

On Tue, May 27, 2025 at 6:22 PM Byungchul Park <byungchul@...com> wrote:
>
> On Tue, May 27, 2025 at 01:03:32PM -0700, Mina Almasry wrote:
> > On Mon, May 26, 2025 at 7:50 PM Byungchul Park <byungchul@...com> wrote:
> > >
> > > On Fri, May 23, 2025 at 12:25:52PM +0900, Byungchul Park wrote:
> > > > > To simplify struct page, the page pool members of struct page should be
> > > > > moved elsewhere, allowing these members to be removed from struct page.
> > > >
> > > > Introduce a network memory descriptor to store the members, struct
> > > > netmem_desc, reusing struct net_iov that already mirrored struct page.
> > > >
> > > > While at it, relocate _pp_mapping_pad to group struct net_iov's fields.
> > > >
> > > > Signed-off-by: Byungchul Park <byungchul@...com>
> > > > ---
> > > > include/linux/mm_types.h | 2 +-
> > > > include/net/netmem.h | 43 +++++++++++++++++++++++++++++++++-------
> > > > 2 files changed, 37 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > > > index 56d07edd01f9..873e820e1521 100644
> > > > --- a/include/linux/mm_types.h
> > > > +++ b/include/linux/mm_types.h
> > > > @@ -120,13 +120,13 @@ struct page {
> > > > unsigned long private;
> > > > };
> > > > struct { /* page_pool used by netstack */
> > > > + unsigned long _pp_mapping_pad;
> > > > /**
> > > > * @pp_magic: magic value to avoid recycling non
> > > > * page_pool allocated pages.
> > > > */
> > > > unsigned long pp_magic;
> > > > struct page_pool *pp;
> > > > - unsigned long _pp_mapping_pad;
> > > > unsigned long dma_addr;
> > > > atomic_long_t pp_ref_count;
> > > > };
> > > > diff --git a/include/net/netmem.h b/include/net/netmem.h
> > > > index 386164fb9c18..08e9d76cdf14 100644
> > > > --- a/include/net/netmem.h
> > > > +++ b/include/net/netmem.h
> > > > @@ -31,12 +31,41 @@ enum net_iov_type {
> > > > };
> > > >
> > > > struct net_iov {
> > > > - enum net_iov_type type;
> > > > - unsigned long pp_magic;
> > > > - struct page_pool *pp;
> > > > - struct net_iov_area *owner;
> > > > - unsigned long dma_addr;
> > > > - atomic_long_t pp_ref_count;
> > > > + /*
> > > > + * XXX: Now that struct netmem_desc overlays on struct page,
> > > > + * struct_group_tagged() should cover all of them. However,
> > > > + * a separate struct netmem_desc should be declared and embedded,
> > > > + * once struct netmem_desc is no longer overlayed but it has its
> > > > + * own instance from slab. The final form should be:
> > > > + *
> > > > + * struct netmem_desc {
> > > > + * unsigned long pp_magic;
> > > > + * struct page_pool *pp;
> > > > + * unsigned long dma_addr;
> > > > + * atomic_long_t pp_ref_count;
> > > > + * };
> > > > + *
> > > > + * struct net_iov {
> > > > + * enum net_iov_type type;
> > > > + * struct net_iov_area *owner;
> > > > + * struct netmem_desc;
> > > > + * };
> > > > + */
> > > > + struct_group_tagged(netmem_desc, desc,
> > >
> > > > So.. For now, this is the best option we can pick. We can do all that
> > > > you told me once struct netmem_desc has its own instance from slab.
> > >
> > > Again, it's because the page pool fields (or netmem things) from struct
> > > page will be gone by this series.
> > >
> > > Mina, thoughts?
> > >
> >
> > Can you please post an updated series with the approach you have in
> > mind? I think this series as-is seems broken vis-a-vis the
> > _pp_mapping_pad field move, which looks incorrect. Pavel and I have also
> > commented on patch 18 that removing the ASSERTS seems incorrect, as
> > it breaks the symmetry between struct page and struct net_iov.
>
> I told you I will fix it. I will send the updated series shortly, but
> only for *review*, since we know this work can only be completed once
> the following work has been done:
>
> https://lore.kernel.org/all/20250520205920.2134829-2-anthony.l.nguyen@intel.com/
> https://lore.kernel.org/all/1747950086-1246773-9-git-send-email-tariqt@nvidia.com/
>
> > It's not clear to me if the fields are being removed from struct page,
> > where are they going... the approach ptdesc for example has taken is
>
> They are going to struct net_iov.

Oh. I see. My gut reaction is I'm not sure moving the page_pool fields
to struct net_iov will work.

struct net_iov shares some fields with struct page, but abstractly
it's very different.

struct page is allocated by the mm stack via things like alloc_pages
and can be passed to mm apis such as put_page() (called from
skb_frag_unref) and vm_insert_pages (called from
tcp_zerocopy_vm_insert_batch_error).

struct net_iov is kvmalloced by networking code (see
net_devmem_bind_dmabuf for example), and *must not* be passed to any
mm apis as it's not a struct page at all. Accidentally calling
vm_insert_pages on a struct net_iov will cause a kernel crash or some
memory corruption.

So two abstractly different things probably should not share the same
in-kernel struct.

One thing that maybe could work is if struct net_iov has a field in it
which tells us whether it's actually a struct page that can be passed
to mm apis, or not a struct page, in which case it cannot be passed to
mm apis.
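
To make that idea a bit more concrete, here is a rough userspace sketch,
not kernel code and not a proposal for the actual layout. All names in
it (demo_kind, demo_page, demo_desc, demo_desc_to_page) are hypothetical
and exist only for illustration. The point is just that every path
toward an mm api would go through one helper that checks the field
first:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical kinds; this is NOT the kernel's enum net_iov_type. */
enum demo_kind {
	DEMO_KIND_PAGE,    /* backed by a real page, may reach mm apis */
	DEMO_KIND_NET_IOV, /* networking-only memory, never a page */
};

/* Stand-in for struct page, reduced to one field for the demo. */
struct demo_page {
	int refcount;
};

/* Hypothetical shared descriptor carrying the "is it a page?" field. */
struct demo_desc {
	enum demo_kind kind;
	struct demo_page *page; /* valid only for DEMO_KIND_PAGE */
};

static bool demo_desc_is_page(const struct demo_desc *d)
{
	return d->kind == DEMO_KIND_PAGE;
}

/*
 * Return the backing page, or NULL when the descriptor is not a page
 * and therefore must not be handed to any mm api.
 */
static struct demo_page *demo_desc_to_page(const struct demo_desc *d)
{
	return demo_desc_is_page(d) ? d->page : NULL;
}
```

A caller would test the result of demo_desc_to_page() for NULL before
doing anything page-like with it. The cost is an extra branch on every
such path, which is part of why I'm unsure about this direction.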

> Or I should introduce another struct

Maybe introducing another struct is the answer; I'm not sure. The net
stack today already supports struct page and struct net_iov, with
netmem_ref acting as an abstraction over both. Adding a 3rd struct,
plus more checks to tell whether something is a page, a net_iov, or
something new, will add overhead.
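
For readers less familiar with the abstraction, the sketch below is a
simplified userspace illustration of the low-bit pointer tagging idea
that a two-way reference like netmem_ref relies on. It is not the code
in include/net/netmem.h; the names (demo_ref, DEMO_NET_IOV_BIT, and the
demo_* helpers) are hypothetical, and it assumes both structs are at
least 2-byte aligned so that bit 0 of a valid pointer is always free:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the two real types; the int keeps alignment >= 2. */
struct demo_page    { int id; };
struct demo_net_iov { int id; };

/* One opaque handle for both kinds; bit 0 set means "net_iov". */
typedef uintptr_t demo_ref;
#define DEMO_NET_IOV_BIT ((uintptr_t)1)

static demo_ref demo_ref_from_page(struct demo_page *p)
{
	return (uintptr_t)p;
}

static demo_ref demo_ref_from_net_iov(struct demo_net_iov *n)
{
	return (uintptr_t)n | DEMO_NET_IOV_BIT;
}

static bool demo_ref_is_net_iov(demo_ref r)
{
	return (r & DEMO_NET_IOV_BIT) != 0;
}

/* NULL when the handle does not refer to a page. */
static struct demo_page *demo_ref_to_page(demo_ref r)
{
	return demo_ref_is_net_iov(r) ? NULL : (struct demo_page *)r;
}

/* NULL when the handle does not refer to a net_iov. */
static struct demo_net_iov *demo_ref_to_net_iov(demo_ref r)
{
	if (!demo_ref_is_net_iov(r))
		return NULL;
	return (struct demo_net_iov *)(r & ~DEMO_NET_IOV_BIT);
}
```

With two kinds, a single bit test is enough. A third kind would need
another tag bit or a second check on top of this one, and that extra
test would sit on every path that unwraps the handle.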

An additional problem is that there are probably hundreds or thousands
of references to 'page' in the net stack and drivers. I'm not sure
what you're going to do about those. Are you converting all those to
netmem or netmem_desc?

--
Thanks,
Mina