Message-Id: <20250828-cpaasch-pf-927-netmlx5-avoid-copying-the-payload-to-the-malloced-area-v4-0-bfcd5033a77c@openai.com>
Date: Thu, 28 Aug 2025 20:36:17 -0700
From: Christoph Paasch via B4 Relay <devnull+cpaasch.openai.com@...nel.org>
To: Gal Pressman <gal@...dia.com>, Dragos Tatulea <dtatulea@...dia.com>,
Saeed Mahameed <saeedm@...dia.com>, Tariq Toukan <tariqt@...dia.com>,
Mark Bloch <mbloch@...dia.com>, Leon Romanovsky <leon@...nel.org>,
Andrew Lunn <andrew+netdev@...n.ch>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
Jesper Dangaard Brouer <hawk@...nel.org>,
John Fastabend <john.fastabend@...il.com>,
Stanislav Fomichev <sdf@...ichev.me>
Cc: netdev@...r.kernel.org, linux-rdma@...r.kernel.org, bpf@...r.kernel.org,
Christoph Paasch <cpaasch@...nai.com>
Subject: [PATCH net-next v4 0/2] net/mlx5: Avoid payload in skb's linear
part for better GRO-processing
When LRO is enabled on the MLX, mlx5e_skb_from_cqe_mpwrq_nonlinear
copies parts of the payload to the linear part of the skb.
This triggers suboptimal processing in GRO, causing slow throughput,...
This patch series addresses this by using eth_get_headlen to compute the
size of the protocol headers and only copy those bits. This results in
a significant throughput improvement (detailed results in the specific
patch).
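As a rough illustration of the idea (this is not the driver code), the
userspace C sketch below shows how knowing the header length lets only the
protocol headers land in the linear buffer. The names sketch_headlen() and
sketch_copy_headers() are hypothetical. The parser is a deliberately
simplified stand-in for eth_get_headlen(): it assumes untagged
Ethernet + IPv4 + TCP only, whereas the kernel helper relies on the flow
dissector and covers many more protocols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_HLEN     14u     /* untagged Ethernet header */
#define SKETCH_ETH_P_IP     0x0800u /* EtherType for IPv4 */
#define SKETCH_IPPROTO_TCP  6u

/*
 * Hypothetical stand-in for eth_get_headlen(): returns how many bytes at
 * the start of 'frame' are protocol headers, never more than 'len'.
 * Assumption: only untagged Ethernet + IPv4 + TCP is understood; for
 * anything else the length parsed so far is returned. Fragments, VLANs,
 * IPv6 and tunnels are ignored in this sketch.
 */
static size_t sketch_headlen(const uint8_t *frame, size_t len)
{
	size_t off = SKETCH_ETH_HLEN;
	size_t ihl, doff;
	uint16_t ethertype;
	uint8_t proto;

	if (len < off)
		return len;

	ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
	if (ethertype != SKETCH_ETH_P_IP || len < off + 20)
		return off;
	if ((frame[off] >> 4) != 4)
		return off;

	ihl = (size_t)(frame[off] & 0x0f) * 4;  /* IPv4 header length */
	if (ihl < 20 || len < off + ihl)
		return off;

	proto = frame[off + 9];
	off += ihl;
	if (proto != SKETCH_IPPROTO_TCP || len < off + 20)
		return off;

	doff = (size_t)(frame[off + 12] >> 4) * 4;  /* TCP data offset */
	if (doff < 20 || len < off + doff)
		return off;

	return off + doff;
}

/*
 * Copy only the headers into the "linear" buffer and return the number of
 * bytes copied. The payload stays where it is (in the driver's case, in
 * page fragments attached to the skb).
 */
static size_t sketch_copy_headers(uint8_t *linear, size_t linear_cap,
				  const uint8_t *frame, size_t len)
{
	size_t headlen;

	assert(linear != NULL && frame != NULL);

	headlen = sketch_headlen(frame, len);
	if (headlen > linear_cap)
		headlen = linear_cap;

	memcpy(linear, frame, headlen);
	return headlen;
}
```

For a frame carrying a 20-byte IPv4 header and a 20-byte TCP header, the
sketch copies 54 bytes regardless of how much payload follows.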
Signed-off-by: Christoph Paasch <cpaasch@...nai.com>
---
Changes in v4:
- Use eth_get_headlen() instead of building a dissector based on struct mlx5_cqe64.
This mimics what other drivers,... are doing as well. (Eric Dumazet
<edumazet@...gle.com>)
- Link to v3: https://lore.kernel.org/r/20250825-cpaasch-pf-927-netmlx5-avoid-copying-the-payload-to-the-malloced-area-v3-0-5527e9eb6efc@openai.com
Changes in v3:
- Avoid computing headlen when it is not absolutely necessary (e.g., xdp
decides to "consume" the packet) (Dragos Tatulea <dtatulea@...dia.com> & Jakub Kicinski <kuba@...nel.org>)
- Given the above change, consolidate the check for min3(...) in the new
function to avoid code duplication.
- Make sure local variables are in reverse xmas-tree order.
- Refine comment about why the check for l4_type works as is.
- Link to v2: https://lore.kernel.org/r/20250816-cpaasch-pf-927-netmlx5-avoid-copying-the-payload-to-the-malloced-area-v2-0-b11b30bc2d10@openai.com
Changes in v2:
- Refine commit-message with more info and testing data
- Make mlx5e_cqe_get_min_hdr_len() return MLX5E_RX_MAX_HEAD when l3_type
is neither IPv4 nor IPv6. Same for the l4_type. That way behavior is
unchanged for other traffic types.
- Rename mlx5e_cqe_get_min_hdr_len to mlx5e_cqe_estimate_hdr_len
- Link to v1: https://lore.kernel.org/r/20250713-cpaasch-pf-927-netmlx5-avoid-copying-the-payload-to-the-malloced-area-v1-0-ecaed8c2844e@openai.com
---
Christoph Paasch (2):
net/mlx5: DMA-sync earlier in mlx5e_skb_from_cqe_mpwrq_nonlinear
net/mlx5: Avoid copying payload to the skb's linear part
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 24 +++++++++++++++++-------
1 file changed, 17 insertions(+), 7 deletions(-)
---
base-commit: 29828b81a46a3ae55ebc053fce512219172560ba
change-id: 20250712-cpaasch-pf-927-netmlx5-avoid-copying-the-payload-to-the-malloced-area-6524917455a6
Best regards,
--
Christoph Paasch <cpaasch@...nai.com>