Message-ID: <20230331151712.2ccd8317@kernel.org>
Date:   Fri, 31 Mar 2023 15:17:12 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Jesper Dangaard Brouer <jbrouer@...hat.com>
Cc:     davem@...emloft.net, brouer@...hat.com, netdev@...r.kernel.org,
        edumazet@...gle.com, pabeni@...hat.com, hawk@...nel.org,
        ilias.apalodimas@...aro.org
Subject: Re: [RFC net-next 1/2] page_pool: allow caching from safely
 localized NAPI

On Fri, 31 Mar 2023 12:06:43 -0700 Jakub Kicinski wrote:
> > Makes sense, but do read the comment above struct pp_alloc_cache.
> > The sizing of pp_alloc_cache is important for this trick/heuristic to
> > work, meaning pp_alloc_cache must have enough room.
> > It is definitely on purpose that pp_alloc_cache has 128 elements and is
> > only refilled with 64 elements, which leaves room for this kind of
> > trick.  But if Eric's deferred skb freeing has more than 64 pages to
> > free, then we will likely fall back to ptr_ring recycling.
> > 
> > Code-wise, I suggest that you/we change page_pool_put_page_bulk() to
> > have a variant that takes 'allow_direct' (looking at the code below, you
> > might already do this, as this patch over-steers 'allow_direct').  Using
> > the bulk API would then bulk into the ptr_ring in the cases where we
> > cannot use the direct cache.  
> 
> Interesting point, let me re-run some tests with the statistics enabled.
> For a simple stream test I think it may just be too steady to trigger
> over/underflow. Each skb will carry at most 18 pages, and the driver
> should only produce 64 packets / consume 64 pages. Each NAPI cycle will
> start by flushing the deferred frees. So unless there is a hiccup on
> either the app or the NAPI side, the flows of pages in each direction
> should be steady enough to do well with just 128 cache entries. Let me
> get the data and report back.

I patched the driver a bit to use the page pool for HW-GRO.
Below are the recycle stats with HW-GRO and with SW GRO + XDP_PASS (one packet per page).

                           HW-GRO                 page=page
                        before      after        before      after
recycle:
  cached:                    0  138669686             0  150197505
  cache_full:                0     223391             0      74582
  ring:              138551933    9997191     149299454          0
  ring_full:                 0        488          3154     127590
  released_refcnt:           0          0             0          0

alloc:
  fast:              136491361  148615710     146969587  150322859
  slow:                   1772       1799           144        105
  slow_high_order:           0          0             0          0
  empty:                  1772       1799           144        105
  refill:              2165245     156302       2332880       2128
  waive:                     0          0             0          0

Which seems to confirm that for this trivial test the cache sizing is
good enough, and that I won't see any benefit from batching (the cache is
full only 0.16% and 0.05% of the time, respectively).
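
(For reference, those percentages are just the cache_full counters relative
to the cached counters in the table above: 223391 / 138669686 ~= 0.16% for
HW-GRO and 74582 / 150197505 ~= 0.05% for page=page.)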
