lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250725065038.GA58004@system.software.com>
Date: Fri, 25 Jul 2025 15:50:38 +0900
From: Byungchul Park <byungchul@...com>
To: Mina Almasry <almasrymina@...gle.com>
Cc: linux-mm@...ck.org, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, kernel_team@...ynix.com,
	harry.yoo@...cle.com, ast@...nel.org, daniel@...earbox.net,
	davem@...emloft.net, kuba@...nel.org, hawk@...nel.org,
	john.fastabend@...il.com, sdf@...ichev.me, saeedm@...dia.com,
	leon@...nel.org, tariqt@...dia.com, mbloch@...dia.com,
	andrew+netdev@...n.ch, edumazet@...gle.com, pabeni@...hat.com,
	akpm@...ux-foundation.org, david@...hat.com,
	lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, vbabka@...e.cz,
	rppt@...nel.org, surenb@...gle.com, mhocko@...e.com,
	horms@...nel.org, jackmanb@...gle.com, hannes@...xchg.org,
	ziy@...dia.com, ilias.apalodimas@...aro.org, willy@...radead.org,
	brauner@...nel.org, kas@...nel.org, yuzhao@...gle.com,
	usamaarif642@...il.com, baolin.wang@...ux.alibaba.com,
	toke@...hat.com, asml.silence@...il.com, bpf@...r.kernel.org,
	linux-rdma@...r.kernel.org
Subject: Re: [PATCH] mm, page_pool: introduce a new page type for page pool
 in page type

On Tue, Jul 22, 2025 at 03:17:15PM -0700, Mina Almasry wrote:
> On Sun, Jul 20, 2025 at 10:49 PM Byungchul Park <byungchul@...com> wrote:
> > diff --git a/net/core/netmem_priv.h b/net/core/netmem_priv.h
> > index cd95394399b4..39a97703d9ed 100644
> > --- a/net/core/netmem_priv.h
> > +++ b/net/core/netmem_priv.h
> > @@ -8,21 +8,11 @@ static inline unsigned long netmem_get_pp_magic(netmem_ref netmem)
> >         return __netmem_clear_lsb(netmem)->pp_magic & ~PP_DMA_INDEX_MASK;
> >  }
> >
> > -static inline void netmem_or_pp_magic(netmem_ref netmem, unsigned long pp_magic)
> > -{
> > -       __netmem_clear_lsb(netmem)->pp_magic |= pp_magic;
> > -}
> > -
> > -static inline void netmem_clear_pp_magic(netmem_ref netmem)
> > -{
> > -       WARN_ON_ONCE(__netmem_clear_lsb(netmem)->pp_magic & PP_DMA_INDEX_MASK);
> > -
> > -       __netmem_clear_lsb(netmem)->pp_magic = 0;
> > -}
> > -
> >  static inline bool netmem_is_pp(netmem_ref netmem)
> >  {
> > -       return (netmem_get_pp_magic(netmem) & PP_MAGIC_MASK) == PP_SIGNATURE;
> > +       if (netmem_is_net_iov(netmem))
> > +               return true;
> 
> As Pavel alludes, this is dubious, and at least it's difficult to
> reason about it.
> 
> There could be net_iovs that are not attached to pp, and should not be
> treated as pp memory. These are in the devmem (and future net_iov) tx
> paths.
> 
> We need a way to tell if a net_iov is pp or not. A couple of options:
> 
> 1. We could have it such that if net_iov->pp is set, then the
> netmem_is_pp == true, otherwise false.
> 2. We could implement a page-flags equivalent for net_iov.
> 
> Option #1 is simpler and is my preferred. To do that properly, you need to:
> 
> 1. Make sure everywhere net_iovs are allocated that pp=NULL in the
> non-pp case and pp=non NULL in the pp case. those callsites are
> net_devmem_bind_dmabuf (devmem rx & tx path), io_zcrx_create_area

A few seconds reviewing the code, fortunately netmem_set_pp(pool) and
netmem_or_pp_magic(PP_SIGNATURE) are always called paired, and
netmem_set_pp(NULL) and netmem_clear_pp_magic() are always called paired
too.

And there's no code to directly assign a value to ->pp and ->pp_magic,
except in net_devmem_alloc_dmabuf() but that is also safe because always
followed by page_pool_set_pp_info().

Even though I think it's already equivalent between checking
'->pp != NULL' and '->pp_magic == PP_SIGNATURE' with the current code,
more consideration for better code should be always welcome.

As you mentioned, at net_devmem_bind_dmabuf() and io_zcrx_create_area(),
it'd better initialize ->pp and ->pp_magic like:

--
diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 00d0064b22a5..8f2051b2c505 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -430,6 +430,7 @@ static int io_zcrx_create_area(struct io_zcrx_ifq *ifq,
 		area->freelist[i] = i;
 		atomic_set(&area->user_refs[i], 0);
 		niov->type = NET_IOV_IOURING;
+		page_pool_clear_pp_info(net_iov_to_netmem(niov));
 	}
 
 	area->free_count = nr_iovs;
diff --git a/net/core/devmem.c b/net/core/devmem.c
index b3a62ca0df65..5d017c9f4986 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -285,6 +285,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 			niov = &owner->area.niovs[i];
 			niov->type = NET_IOV_DMABUF;
 			niov->owner = &owner->area;
+			page_pool_clear_pp_info(net_iov_to_netmem(niov));
 			page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov),
 						      net_devmem_get_dma_addr(niov));
 			if (direction == DMA_TO_DEVICE)
--

Do you think it works for using ->pp to check if a niov is pp?

	Byungchul

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ