Message-ID: <c433e59d-1262-656e-b08c-30adfc12edcc@gmail.com>
Date: Wed, 12 Apr 2023 09:41:16 +0300
From: Tariq Toukan <ttoukan.linux@...il.com>
To: Jakub Kicinski <kuba@...nel.org>, davem@...emloft.net
Cc: netdev@...r.kernel.org, edumazet@...gle.com, pabeni@...hat.com,
hawk@...nel.org, ilias.apalodimas@...aro.org,
linyunsheng@...wei.com, Dragos Tatulea <dtatulea@...dia.com>
Subject: Re: [PATCH net-next 0/3] page_pool: allow caching from safely
localized NAPI
Hi,
Happy holidays! I was traveling and couldn't review the RFCs.
Dragos and I had this task on our roadmap after our recent page pool work
in mlx5e. We're so glad to see this patchset! :)
On 11/04/2023 23:17, Jakub Kicinski wrote:
> I went back to the explicit "are we in NAPI" method, mostly
> because I don't like having both around :( (even tho I maintain
> that in_softirq() && !in_hardirq() is as safe, as softirqs do
> not nest).
>
> Still returning the skbs to a CPU, tho, not to the NAPI instance.
> I reckon we could create a small refcounted struct per NAPI instance
> which would allow sockets and other users to hold a persistent
> and safe reference. But that's a bigger change, and I get 90+%
> recycling thru the cache with just these patches (for RR and
> streaming tests with 100% CPU use it's almost 100%).
>
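Just to make sure I read the recycling condition correctly, here is a rough
sketch of how I understand it (the helper and field names below are my own
illustration, not necessarily the identifiers the series uses): direct
recycling into the pool's lockless cache is allowed only when the skb is
freed from the NAPI instance that owns the pool, on the CPU that NAPI is
currently running on; every other context falls back to the ptr_ring.

        /* Illustrative only -- pp_napi / list_owner are assumed names,
         * based on my reading of the cover letter, not the actual patch.
         */
        static bool napi_safe_to_recycle_direct(const struct napi_struct *pp_napi,
                                                bool napi_safe)
        {
                /* napi_safe: the freeing path knows it runs from NAPI context. */
                return napi_safe && pp_napi &&
                       READ_ONCE(pp_napi->list_owner) == smp_processor_id();
        }

Is that an accurate reading?
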
> Some numbers for streaming test with 100% CPU use (from previous version,
> but really they perform the same):
>
>                          HW-GRO              page=page
>                       before      after     before      after
> recycle:
> cached:                    0  138669686          0  150197505
> cache_full:                0     223391          0      74582
> ring:               138551933    9997191  149299454          0
> ring_full:                  0        488       3154     127590
> released_refcnt:            0          0          0          0
>
Impressive.
Dragos tested your RFC v1.
He can test this one as well; we expect the same results.
> alloc:
> fast:               136491361  148615710  146969587  150322859
> slow:                      1772       1799        144        105
> slow_high_order:            0          0          0          0
> empty:                     1772       1799        144        105
> refill:                 2165245     156302    2332880       2128
> waive:                       0          0          0          0
>
General note:
For fragmented page-pool pages, whether a release goes through the
cache or the ring depends only on the last releasing thread and its context.
Our in-driver deferred release trick is in many cases beneficial even in
the presence of this idea. The sets of cases improved by each idea
intersect, but are not completely identical.
That's why we decided to go with both solutions working together, rather
than only one.
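To make that concrete, a rough sketch of the life of a fragmented page
(illustrative names only, not page_pool or mlx5e code): the page is returned
to the pool only when its last fragment reference is dropped, so whichever
thread does that last put, in whatever context, decides between the cache
and the ring. Our deferred release batches the driver's own puts into the
RX NAPI cycle, so in many cases the driver ends up holding that last
reference and releasing it from a napi-safe context, even when the stack
freed its skbs elsewhere.

        /* Sketch only -- the recycle_*() helpers are invented names standing
         * in for the page_pool internals; this is not driver or core code.
         */
        static void put_page_frag(struct page_pool *pool, struct page *page,
                                  bool napi_safe)
        {
                /* Earlier fragment releases just drop a reference and return. */
                if (!atomic_long_dec_and_test(&page->pp_frag_count))
                        return;

                /* Only the last put reaches here, so only its context matters. */
                if (napi_safe)
                        recycle_into_lockless_cache(pool, page);
                else
                        recycle_via_ptr_ring(pool, page);
        }
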
> v1:
> - rename the arg in_normal_napi -> napi_safe
> - also allow recycling in __kfree_skb_defer()
> rfcv2: https://lore.kernel.org/all/20230405232100.103392-1-kuba@kernel.org/
>
> Jakub Kicinski (3):
> net: skb: plumb napi state thru skb freeing paths
> page_pool: allow caching from safely localized NAPI
> bnxt: hook NAPIs to page pools
>
> Documentation/networking/page_pool.rst | 1 +
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 1 +
> include/linux/netdevice.h | 3 ++
> include/linux/skbuff.h | 20 +++++++----
> include/net/page_pool.h | 3 +-
> net/core/dev.c | 3 ++
> net/core/page_pool.c | 15 ++++++--
> net/core/skbuff.c | 42 ++++++++++++-----------
> 8 files changed, 58 insertions(+), 30 deletions(-)
>