Message-Id: <20210604183349.30040-5-mcroce@linux.microsoft.com>
Date: Fri, 4 Jun 2021 20:33:48 +0200
From: Matteo Croce <mcroce@...ux.microsoft.com>
To: netdev@...r.kernel.org, linux-mm@...ck.org
Cc: Ayush Sawal <ayush.sawal@...lsio.com>,
Vinay Kumar Yadav <vinay.yadav@...lsio.com>,
Rohit Maheshwari <rohitm@...lsio.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
Marcin Wojtas <mw@...ihalf.com>,
Russell King <linux@...linux.org.uk>,
Mirko Lindner <mlindner@...vell.com>,
Stephen Hemminger <stephen@...workplumber.org>,
Tariq Toukan <tariqt@...dia.com>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
John Fastabend <john.fastabend@...il.com>,
Boris Pismenny <borisp@...dia.com>,
Arnd Bergmann <arnd@...db.de>,
Andrew Morton <akpm@...ux-foundation.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Vlastimil Babka <vbabka@...e.cz>, Yu Zhao <yuzhao@...gle.com>,
Will Deacon <will@...nel.org>,
Fenghua Yu <fenghua.yu@...el.com>,
Roman Gushchin <guro@...com>, Hugh Dickins <hughd@...gle.com>,
Peter Xu <peterx@...hat.com>, Jason Gunthorpe <jgg@...pe.ca>,
Jonathan Lemon <jonathan.lemon@...il.com>,
Alexander Lobakin <alobakin@...me>,
Cong Wang <cong.wang@...edance.com>, wenxu <wenxu@...oud.cn>,
Kevin Hao <haokexin@...il.com>,
Jakub Sitnicki <jakub@...udflare.com>,
Marco Elver <elver@...gle.com>,
Willem de Bruijn <willemb@...gle.com>,
Miaohe Lin <linmiaohe@...wei.com>,
Yunsheng Lin <linyunsheng@...wei.com>,
Guillaume Nault <gnault@...hat.com>,
linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
bpf@...r.kernel.org, Matthew Wilcox <willy@...radead.org>,
Eric Dumazet <edumazet@...gle.com>,
David Ahern <dsahern@...il.com>,
Lorenzo Bianconi <lorenzo@...nel.org>,
Saeed Mahameed <saeedm@...dia.com>,
Andrew Lunn <andrew@...n.ch>, Paolo Abeni <pabeni@...hat.com>,
Sven Auhagen <sven.auhagen@...eatech.de>
Subject: [PATCH net-next v7 4/5] mvpp2: recycle buffers
From: Matteo Croce <mcroce@...rosoft.com>
Use the new recycling API for page_pool.
In a drop rate test, the packet rate is almost doubled,
from 1110 Kpps to 2128 Kpps.
perf top on a stock system shows:
  Overhead  Shared Object  Symbol
    34.88%  [kernel]       [k] page_pool_release_page
     8.06%  [kernel]       [k] free_unref_page
     6.42%  [mvpp2]        [k] mvpp2_rx
     6.07%  [kernel]       [k] eth_type_trans
     5.18%  [kernel]       [k] __netif_receive_skb_core
     4.95%  [kernel]       [k] build_skb
     4.88%  [kernel]       [k] kmem_cache_free
     3.97%  [kernel]       [k] kmem_cache_alloc
     3.45%  [kernel]       [k] dev_gro_receive
     2.73%  [kernel]       [k] page_frag_free
     2.07%  [kernel]       [k] __alloc_pages_bulk
     1.99%  [kernel]       [k] arch_local_irq_save
     1.84%  [kernel]       [k] skb_release_data
     1.20%  [kernel]       [k] netif_receive_skb_list_internal
With the packet rate stable at about 1110 Kpps:
tx: 0 bps 0 pps rx: 532.7 Mbps 1110 Kpps
tx: 0 bps 0 pps rx: 532.6 Mbps 1110 Kpps
tx: 0 bps 0 pps rx: 532.4 Mbps 1109 Kpps
tx: 0 bps 0 pps rx: 532.1 Mbps 1109 Kpps
tx: 0 bps 0 pps rx: 531.9 Mbps 1108 Kpps
tx: 0 bps 0 pps rx: 531.9 Mbps 1108 Kpps
And this is the same output with recycling enabled:
  Overhead  Shared Object  Symbol
    12.91%  [kernel]       [k] eth_type_trans
    12.54%  [mvpp2]        [k] mvpp2_rx
     9.67%  [kernel]       [k] build_skb
     9.63%  [kernel]       [k] __netif_receive_skb_core
     8.44%  [kernel]       [k] page_pool_put_page
     8.07%  [kernel]       [k] kmem_cache_free
     7.79%  [kernel]       [k] kmem_cache_alloc
     6.86%  [kernel]       [k] dev_gro_receive
     3.19%  [kernel]       [k] skb_release_data
     2.41%  [kernel]       [k] netif_receive_skb_list_internal
     2.18%  [kernel]       [k] page_pool_refill_alloc_cache
     1.76%  [kernel]       [k] napi_gro_receive
     1.61%  [kernel]       [k] kfree_skb
     1.20%  [kernel]       [k] dma_sync_single_for_device
     1.16%  [mvpp2]        [k] mvpp2_poll
     1.12%  [mvpp2]        [k] mvpp2_read
With the packet rate above 2100 Kpps:
tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1021 Mbps 2127 Kpps
tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1022 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1022 Mbps 2129 Kpps
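As a sanity check on the two rate listings above, the figures can be
cross-checked with a few lines of arithmetic. The sketch below is only
illustrative: the helper names are made up for this note, and the 60 bytes
counted per frame (a 64 byte frame without the 4 byte FCS) is an inference
from the reported Mbps values, not something measured separately. The
per-packet time is just the inverse of the rate, so it only reads as a CPU
cost if the RX path is bound by a single core, which is assumed here.

```c
#include <stdio.h>

/* Ratio between two packet rates given in Kpps. */
static double rate_speedup(double before_kpps, double after_kpps)
{
	return after_kpps / before_kpps;
}

/* Average time per packet, in nanoseconds, at a given rate in Kpps. */
static double ns_per_packet(double kpps)
{
	return 1e9 / (kpps * 1e3);
}

/*
 * Bit rate in Mbps for a packet rate in Kpps, given how many bytes of
 * each frame the counter includes (assumed, see the note above).
 */
static double rx_mbps(double kpps, unsigned int counted_bytes)
{
	return kpps * 1e3 * counted_bytes * 8 / 1e6;
}

/* Print the derived figures for one measurement. */
static void print_rate(const char *label, double kpps)
{
	printf("%s: %.0f Kpps, %.1f ns/packet, %.1f Mbps at 60 bytes/frame\n",
	       label, kpps, ns_per_packet(kpps), rx_mbps(kpps, 60));
}
```

Calling print_rate("stock", 1110.0) and print_rate("recycling", 2128.0)
gives roughly 901 ns and 470 ns per packet, and rate_speedup(1110.0, 2128.0)
is about 1.92, which matches "almost doubled". The computed bit rates
(532.8 and 1021.4 Mbps) line up with the rx column in the listings.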
The major performance increase is explained by the fact that the most
CPU-consuming functions (page_pool_release_page, page_frag_free and
free_unref_page) are no longer called on a per-packet basis.
The test was done by sending 64-byte Ethernet frames with an invalid
ethertype to the macchiatobin, so the packets are dropped early in the RX path.
Signed-off-by: Matteo Croce <mcroce@...rosoft.com>
---
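For readers less familiar with the idea, the effect described in the commit
message can be illustrated with a toy userspace model. This is a sketch under
loud assumptions, not the kernel's page_pool: every toy_* name is
hypothetical, malloc()/free() stand in for the page allocator, and DMA
mapping, page refcounts and NAPI context are ignored entirely. It only shows
why handing a buffer back to a per-pool cache removes the allocator round
trip from the per-packet path.

```c
#include <stdlib.h>

#define TOY_CACHE_SIZE 64	/* arbitrary cache depth for the model */
#define TOY_BUF_SIZE 2048	/* arbitrary buffer size for the model */

struct toy_pool {
	void *cache[TOY_CACHE_SIZE];	/* LIFO stack of idle buffers */
	int count;			/* buffers currently cached */
	long slow_allocs;		/* fallbacks to malloc() */
	long slow_frees;		/* buffers handed to free() */
};

/* Take a buffer from the cache, or from the system allocator if empty. */
static void *toy_pool_alloc(struct toy_pool *p)
{
	if (p->count > 0)
		return p->cache[--p->count];
	p->slow_allocs++;
	return malloc(TOY_BUF_SIZE);
}

/* Model of the old path: the buffer leaves the pool and is freed. */
static void toy_pool_release(struct toy_pool *p, void *buf)
{
	p->slow_frees++;
	free(buf);
}

/* Model of the new path: the buffer goes back to the cache if it fits. */
static void toy_pool_recycle(struct toy_pool *p, void *buf)
{
	if (p->count < TOY_CACHE_SIZE) {
		p->cache[p->count++] = buf;
		return;
	}
	toy_pool_release(p, buf);
}

/*
 * Receive and immediately drop n packets.
 * Returns the number of slow-path allocator operations performed.
 */
static long toy_rx_drop(struct toy_pool *p, long n, int recycle)
{
	for (long i = 0; i < n; i++) {
		void *buf = toy_pool_alloc(p);

		if (!buf)
			break;
		if (recycle)
			toy_pool_recycle(p, buf);
		else
			toy_pool_release(p, buf);
	}
	return p->slow_allocs + p->slow_frees;
}

/* Free whatever is still sitting in the cache. */
static void toy_pool_destroy(struct toy_pool *p)
{
	while (p->count > 0)
		free(p->cache[--p->count]);
}
```

With a zero-initialized struct toy_pool, dropping 1000 packets without
recycling costs two allocator operations per packet, while with recycling
only the very first allocation reaches malloc(). In the real driver the
recycling itself happens when the skb is freed; the toy model folds that
step into toy_pool_recycle().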
drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index d4fb620f53f3..b1d186abcc6c 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -3997,7 +3997,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 		}
 
 		if (pp)
-			page_pool_release_page(pp, virt_to_page(data));
+			skb_mark_for_recycle(skb, virt_to_page(data), pp);
 		else
 			dma_unmap_single_attrs(dev->dev.parent, dma_addr,
 					       bm_pool->buf_size, DMA_FROM_DEVICE,
--
2.31.1