lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHS8izOOZ5w_+5W=uYGrr3V8VjYXfo6fzs3cjJF1eoZsqcm71Q@mail.gmail.com>
Date: Mon, 14 Jul 2025 12:02:02 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Byungchul Park <byungchul@...com>
Cc: willy@...radead.org, netdev@...r.kernel.org, linux-kernel@...r.kernel.org, 
	linux-mm@...ck.org, kernel_team@...ynix.com, kuba@...nel.org, 
	ilias.apalodimas@...aro.org, harry.yoo@...cle.com, hawk@...nel.org, 
	akpm@...ux-foundation.org, davem@...emloft.net, john.fastabend@...il.com, 
	andrew+netdev@...n.ch, asml.silence@...il.com, toke@...hat.com, 
	tariqt@...dia.com, edumazet@...gle.com, pabeni@...hat.com, saeedm@...dia.com, 
	leon@...nel.org, ast@...nel.org, daniel@...earbox.net, david@...hat.com, 
	lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, vbabka@...e.cz, 
	rppt@...nel.org, surenb@...gle.com, mhocko@...e.com, horms@...nel.org, 
	linux-rdma@...r.kernel.org, bpf@...r.kernel.org, vishal.moola@...il.com, 
	hannes@...xchg.org, ziy@...dia.com, jackmanb@...gle.com
Subject: Re: [PATCH net-next v9 6/8] mlx4: use netmem descriptor and APIs for
 page pool

On Thu, Jul 10, 2025 at 6:33 PM Byungchul Park <byungchul@...com> wrote:
>
> On Thu, Jul 10, 2025 at 11:29:35AM -0700, Mina Almasry wrote:
> > On Thu, Jul 10, 2025 at 1:28 AM Byungchul Park <byungchul@...com> wrote:
> > >
> > > To simplify struct page, the effort to separate its own descriptor from
> > > struct page is required and the work for page pool is on going.
> > >
> > > Use netmem descriptor and APIs for page pool in mlx4 code.
> > >
> > > Signed-off-by: Byungchul Park <byungchul@...com>
> > > ---
> > >  drivers/net/ethernet/mellanox/mlx4/en_rx.c   | 48 +++++++++++---------
> > >  drivers/net/ethernet/mellanox/mlx4/en_tx.c   |  8 ++--
> > >  drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |  4 +-
> > >  3 files changed, 32 insertions(+), 28 deletions(-)
> > >
> > > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > index b33285d755b9..7cf0d2dc5011 100644
> > > --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > @@ -62,18 +62,18 @@ static int mlx4_en_alloc_frags(struct mlx4_en_priv *priv,
> > >         int i;
> > >
> > >         for (i = 0; i < priv->num_frags; i++, frags++) {
> > > -               if (!frags->page) {
> > > -                       frags->page = page_pool_alloc_pages(ring->pp, gfp);
> > > -                       if (!frags->page) {
> > > +               if (!frags->netmem) {
> > > +                       frags->netmem = page_pool_alloc_netmems(ring->pp, gfp);
> > > +                       if (!frags->netmem) {
> > >                                 ring->alloc_fail++;
> > >                                 return -ENOMEM;
> > >                         }
> > > -                       page_pool_fragment_page(frags->page, 1);
> > > +                       page_pool_fragment_netmem(frags->netmem, 1);
> > >                         frags->page_offset = priv->rx_headroom;
> > >
> > >                         ring->rx_alloc_pages++;
> > >                 }
> > > -               dma = page_pool_get_dma_addr(frags->page);
> > > +               dma = page_pool_get_dma_addr_netmem(frags->netmem);
> > >                 rx_desc->data[i].addr = cpu_to_be64(dma + frags->page_offset);
> > >         }
> > >         return 0;
> > > @@ -83,10 +83,10 @@ static void mlx4_en_free_frag(const struct mlx4_en_priv *priv,
> > >                               struct mlx4_en_rx_ring *ring,
> > >                               struct mlx4_en_rx_alloc *frag)
> > >  {
> > > -       if (frag->page)
> > > -               page_pool_put_full_page(ring->pp, frag->page, false);
> > > +       if (frag->netmem)
> > > +               page_pool_put_full_netmem(ring->pp, frag->netmem, false);
> > >         /* We need to clear all fields, otherwise a change of priv->log_rx_info
> > > -        * could lead to see garbage later in frag->page.
> > > +        * could lead to see garbage later in frag->netmem.
> > >          */
> > >         memset(frag, 0, sizeof(*frag));
> > >  }
> > > @@ -440,29 +440,33 @@ static int mlx4_en_complete_rx_desc(struct mlx4_en_priv *priv,
> > >         unsigned int truesize = 0;
> > >         bool release = true;
> > >         int nr, frag_size;
> > > -       struct page *page;
> > > +       netmem_ref netmem;
> > >         dma_addr_t dma;
> > >
> > >         /* Collect used fragments while replacing them in the HW descriptors */
> > >         for (nr = 0;; frags++) {
> > >                 frag_size = min_t(int, length, frag_info->frag_size);
> > >
> > > -               page = frags->page;
> > > -               if (unlikely(!page))
> > > +               netmem = frags->netmem;
> > > +               if (unlikely(!netmem))
> > >                         goto fail;
> > >
> > > -               dma = page_pool_get_dma_addr(page);
> > > +               dma = page_pool_get_dma_addr_netmem(netmem);
> > >                 dma_sync_single_range_for_cpu(priv->ddev, dma, frags->page_offset,
> > >                                               frag_size, priv->dma_dir);
> > >
> > > -               __skb_fill_page_desc(skb, nr, page, frags->page_offset,
> > > -                                    frag_size);
> > > +               __skb_fill_netmem_desc(skb, nr, netmem, frags->page_offset,
> > > +                                      frag_size);
> > >
> > >                 truesize += frag_info->frag_stride;
> > >                 if (frag_info->frag_stride == PAGE_SIZE / 2) {
> > > +                       struct page *page = netmem_to_page(netmem);
> >
> > This cast is not safe, try to use the netmem type directly.
>
> Can it be net_iov?  It already ensures it's a page-backed netmem.  Why
> is that unsafe?
>

Precisely because it can be net_iov. The whole point of netmem_ref is
that it's an abstract type that can be something else underneath.
Converting the driver to netmem only to then say "well, the netmem is
just pages, cast it back" is just adding a lot of useless typecasting
and wasted cpu cycles with no benefit. In general netmem_to_page casts
in drivers are heavily discouraged.

> With netmem, page_count() and page_to_nid() cannot be used, but needed.
> Or checking 'page == NULL' after the casting works for you?
>

I'm really hoping that you can read the code and be a bit familiar
with the driver that you're modifying and suggest a solution that
works for us. netmem_to_page() followed by page_count() without a NULL
check crashes the kernel if the netmem is not a page underneath... so
it shouldn't work for anyone.

What may be acceptable here is netmem_count helper and what not like
we have netmem_is_pfmemalloc which handles net_iov correctly.

However, it looks like this driver is trying to check if the netmem is
recyclable, we have a helper like that already in
__page_pool_page_can_be_recycled. Maybe that can be reused somehow.

-- 
Thanks,
Mina

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ