Message-ID: <bac466e9-c18c-4cc6-a143-4139a3395305@intel.com>
Date: Wed, 16 Jul 2025 13:42:19 +0200
From: Alexander Lobakin <aleksander.lobakin@...el.com>
To: Christoph Paasch <cpaasch@...nai.com>
CC: Saeed Mahameed <saeedm@...dia.com>, Leon Romanovsky <leon@...nel.org>,
Tariq Toukan <tariqt@...dia.com>, Mark Bloch <mbloch@...dia.com>, Andrew Lunn
<andrew+netdev@...n.ch>, "David S. Miller" <davem@...emloft.net>, "Eric
Dumazet" <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
<pabeni@...hat.com>, <linux-rdma@...r.kernel.org>, <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next 2/2] net/mlx5: Avoid copying payload to the skb's
linear part
From: Christoph Paasch <cpaasch@...nai.com>
Date: Mon, 14 Jul 2025 15:22:34 -0700
> On Mon, Jul 14, 2025 at 7:23 AM Alexander Lobakin
> <aleksander.lobakin@...el.com> wrote:
>>
>> From: Christoph Paasch Via B4 Relay <devnull+cpaasch.openai.com@...nel.org>
>> Date: Sun, 13 Jul 2025 16:33:07 -0700
>>
>>> From: Christoph Paasch <cpaasch@...nai.com>
>>>
>>> mlx5e_skb_from_cqe_mpwrq_nonlinear() copies MLX5E_RX_MAX_HEAD (256)
>>> bytes from the page-pool to the skb's linear part. Those 256 bytes
>>> include part of the payload.
>>>
>>> When attempting to do GRO in skb_gro_receive, if headlen > data_offset
>>> (and skb->head_frag is not set), we end up aggregating packets in the
>>
>> How did you end up with ->head_frag not set? IIRC mlx5 uses
>> napi_build_skb(), which explicitly sets ->head_frag to true.
>> It should be false only for kmalloced linear parts.
>
> This particular code-path calls napi_alloc_skb() which ends up calling
> __alloc_skb() and won't set head_frag to 1.
Hmmm. I haven't looked deeply into mlx5 HW GRO internals, but
napi_alloc_skb() falls back to __alloc_skb() only in certain cases; in
the most common cases, it should go "the Eric route" and allocate a
small page frag for its payload.
[...]
>> (the above was correct for 2020 when I last time played with router
>> drivers, but I hope nothing's been broken since then)
>
> Yes, as you correctly point out, it is all about avoiding copying any
> payload so that GRO stays fast.
>
> I can give it a shot with just copying ETH_HLEN and see what perf I
> get. You are probably right that it won't matter much. I just thought
> that, since I have bits in the CQE that give me some hints about which
> headers are present, I could be slightly more efficient.
Yeah, it just depends on the results. On some setups and workloads, just
copying ETH_HLEN might perform better than trying to calculate the
precise payload offset (but not always).
If you really want precise numbers, eth_get_headlen() would do that for
you, but it introduces overhead from the Flow Dissector, so again, only
a test comparison will tell.
>
> Thanks,
> Christoph
>
>>
>>> +
>>> if (prog) {
>>> /* area for bpf_xdp_[store|load]_bytes */
>>> net_prefetchw(netmem_address(frag_page->netmem) + frag_offset);
>>
>> Thanks,
>> Olek
Thanks,
Olek