[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240528142807.903965-16-tariqt@nvidia.com>
Date: Tue, 28 May 2024 17:28:07 +0300
From: Tariq Toukan <tariqt@...dia.com>
To: "David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Eric Dumazet <edumazet@...gle.com>
CC: <netdev@...r.kernel.org>, Saeed Mahameed <saeedm@...dia.com>, Gal Pressman
<gal@...dia.com>, Leon Romanovsky <leonro@...dia.com>, Dragos Tatulea
<dtatulea@...dia.com>, Tariq Toukan <tariqt@...dia.com>
Subject: [PATCH net-next 15/15] net/mlx5e: SHAMPO, Coalesce skb fragments to page size
From: Dragos Tatulea <dtatulea@...dia.com>
When doing hardware GRO (SHAMPO), the driver puts each data payload of a
packet from the wire into one skb fragment. TCP Zero-Copy expects page
sized skb fragments to be able to do it's page-flipping magic. With the
current way of arranging fragments by the driver, only specific MTUs
(page sized multiple + header size) will yield such page sized fragments
in a high percentage.
This change improves payload arrangement in the skb for hardware GRO by
coalescing payloads into a single skb fragment when possible.
To demonstrate the fix, running tcp_mmap with a MTU of 1500 yields:
- Before: 0 % bytes mmap'ed
- After : 81 % bytes mmap'ed
More importantly, coalescing considerably improves the HW GRO performance.
Here are the results for a iperf3 bandwidth benchmark:
+---------+--------+--------+------------------------+-----------+
| streams | SW GRO | HW GRO | HW GRO with coalescing | Unit |
|---------+--------+--------+------------------------+-----------|
| 1 | 36 | 42 | 57 | Gbits/sec |
| 4 | 34 | 39 | 50 | Gbits/sec |
| 8 | 31 | 35 | 43 | Gbits/sec |
+---------+--------+--------+------------------------+-----------+
Benchmark details:
VM based setup
CPU: Intel(R) Xeon(R) Platinum 8380 CPU, 24 cores
NIC: ConnectX-7 100GbE
iperf3 and irq running on same CPU over a single receive queue
Signed-off-by: Dragos Tatulea <dtatulea@...dia.com>
Signed-off-by: Tariq Toukan <tariqt@...dia.com>
---
.../net/ethernet/mellanox/mlx5/core/en_rx.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index e6987bd467d7..54edeb8c652e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -523,15 +523,23 @@ mlx5e_add_skb_shared_info_frag(struct mlx5e_rq *rq, struct skb_shared_info *sinf
static inline void
mlx5e_add_skb_frag(struct mlx5e_rq *rq, struct sk_buff *skb,
- struct page *page, u32 frag_offset, u32 len,
+ struct mlx5e_frag_page *frag_page,
+ u32 frag_offset, u32 len,
unsigned int truesize)
{
- dma_addr_t addr = page_pool_get_dma_addr(page);
+ dma_addr_t addr = page_pool_get_dma_addr(frag_page->page);
+ u8 next_frag = skb_shinfo(skb)->nr_frags;
dma_sync_single_for_cpu(rq->pdev, addr + frag_offset, len,
rq->buff.map_dir);
- skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
- page, frag_offset, len, truesize);
+
+ if (skb_can_coalesce(skb, next_frag, frag_page->page, frag_offset)) {
+ skb_coalesce_rx_frag(skb, next_frag - 1, len, truesize);
+ } else {
+ frag_page->frags++;
+ skb_add_rx_frag(skb, next_frag, frag_page->page,
+ frag_offset, len, truesize);
+ }
}
static inline void
@@ -1956,8 +1964,7 @@ mlx5e_shampo_fill_skb_data(struct sk_buff *skb, struct mlx5e_rq *rq,
u32 pg_consumed_bytes = min_t(u32, PAGE_SIZE - data_offset, data_bcnt);
unsigned int truesize = pg_consumed_bytes;
- frag_page->frags++;
- mlx5e_add_skb_frag(rq, skb, frag_page->page, data_offset,
+ mlx5e_add_skb_frag(rq, skb, frag_page, data_offset,
pg_consumed_bytes, truesize);
data_bcnt -= pg_consumed_bytes;
--
2.31.1
Powered by blists - more mailing lists