Message-ID: <7476374f-cf0c-45d0-8100-1b2cd2f290d5@nbd.name>
Date: Tue, 23 Apr 2024 13:55:14 +0200
From: Felix Fietkau <nbd@....name>
To: Eric Dumazet <edumazet@...gle.com>
Cc: netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
 Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 David Ahern <dsahern@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [RFC] net: add TCP fraglist GRO support

On 23.04.24 13:17, Eric Dumazet wrote:
> On Tue, Apr 23, 2024 at 12:25 PM Felix Fietkau <nbd@....name> wrote:
>>
>> On 23.04.24 12:15, Eric Dumazet wrote:
>> > On Tue, Apr 23, 2024 at 11:41 AM Felix Fietkau <nbd@....name> wrote:
>> >>
>> >> When forwarding TCP after GRO, software segmentation is very expensive,
>> >> especially when the checksum needs to be recalculated.
>> >> One case where that's currently unavoidable is when routing packets over
>> >> PPPoE. Performance improves significantly when using fraglist GRO
>> >> implemented in the same way as for UDP.
>> >>
>> >> Here's a measurement of running 2 TCP streams through a MediaTek MT7622
>> >> device (2-core Cortex-A53), which runs NAT with flow offload enabled from
>> >> one ethernet port to PPPoE on another ethernet port + cake qdisc set to
>> >> 1Gbps.
>> >>
>> >> rx-gro-list off: 630 Mbit/s, CPU 35% idle
>> >> rx-gro-list on:  770 Mbit/s, CPU 40% idle
>> >
>> > Hi Felix
>> >
>> > changelog is a bit terse, and patch complex.
>> >
>> > Could you elaborate why this issue
>> > seems to be related to a specific driver ?
>> >
>> > I think we should push hard to not use frag_list in drivers :/
>> >
>> > And GRO itself could avoid building frag_list skbs
>> > in hosts where forwarding is enabled.
>> >
>> > (Note that we also can increase MAX_SKB_FRAGS to 45 these days)
>>
>> The issue is not related to a specific driver at all. Here's how traffic
>> flows: TCP packets are received on the SoC ethernet driver, the network
>> stack performs regular GRO. The packet gets forwarded by flow offloading
>> until it reaches the PPPoE device. PPPoE does not support GSO packets,
>> so the packets need to be segmented again.
>> This is *very* expensive, since data needs to be copied and checksummed.
> 
> gso segmentation does not copy the payload, unless the device has no
> SG capability.
> 
> I guess something should be done about that, regardless of your GRO work,
> since most ethernet devices support SG these days.
> 
> Some drivers use header split for RX, so forwarding to PPPoE
> would require a linearization anyway, if SG is not properly handled.

In the world of consumer-grade WiFi devices, there are a lot of chipsets 
with limited or nonexistent SG support, and very limited checksum 
offload capabilities on Ethernet. The WiFi side of these devices is 
often even worse. I think fraglist GRO is a decent fallback for the 
inevitable corner cases.
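
To illustrate the cost difference on hardware without SG, here is a rough
userspace sketch. It is not kernel code: struct pkt, fraglist_segment() and
copy_segment() are hypothetical names made up for this example, and the
64-byte buffer is an arbitrary assumption. The fraglist path only unlinks
the original packets again, while the copy path has to touch every payload
byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy packet, standing in for an skb. Not a kernel type. */
struct pkt {
	struct pkt *next;        /* next original packet in the chain */
	size_t len;              /* payload bytes in use */
	unsigned char data[64];  /* arbitrary small buffer for the sketch */
};

/*
 * Fraglist-style segmentation: the original packets were only chained
 * together, so splitting them again is a pointer walk. No payload byte
 * is read or written. Returns the number of segments stored in out[].
 */
static size_t fraglist_segment(struct pkt *head, struct pkt **out, size_t max)
{
	size_t n = 0;

	while (head && n < max) {
		struct pkt *next = head->next;

		head->next = NULL;   /* detach from the chain */
		out[n++] = head;
		head = next;
	}
	return n;
}

/*
 * Copy-style segmentation, as needed when the output device cannot do
 * scatter-gather: the merged payload is cut into mss-sized pieces and
 * each piece is copied into its own buffer. Returns the bytes copied
 * and stores the segment count in *nseg.
 */
static size_t copy_segment(const unsigned char *merged, size_t total,
			   size_t mss, struct pkt *out, size_t max,
			   size_t *nseg)
{
	size_t copied = 0, n = 0;

	while (copied < total && n < max) {
		size_t chunk = total - copied < mss ? total - copied : mss;

		if (chunk > sizeof(out[n].data))
			chunk = sizeof(out[n].data);
		memcpy(out[n].data, merged + copied, chunk);
		out[n].len = chunk;
		out[n].next = NULL;
		copied += chunk;
		n++;
	}
	*nseg = n;
	return copied;
}
```

In this model the fraglist path does work proportional to the number of
packets, while the copy path does work proportional to the number of bytes,
before any checksum is even recalculated.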

>> So in my patch, I changed the code to build fraglist GRO instead of
>> regular GRO packets, whenever there is no local socket to receive the
>> packets. This makes segmenting very cheap, since the original skbs are
>> preserved on the trip through the stack. The only cost is an extra
>> socket lookup whenever NETIF_F_FRAGLIST_GRO is enabled.
> 
> A socket lookup in multi-net-namespace world is not going to work generically,
> but I get the idea now.

Right, I can't think of a proper solution to this at the moment. 
Considering that NETIF_F_FRAGLIST_GRO is opt-in and only meant for 
rather specific configurations anyway, this should not be too much of a 
problem, right?
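
For reference, the decision described in the quoted text boils down to
something like the following. This is a simplified userspace sketch, not the
code from the patch: struct flow_key, local_socket_exists() and
pick_gro_mode() are hypothetical names, and the linear table scan stands in
for the real socket lookup. It also ignores network namespaces entirely,
which is exactly the limitation raised above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical 4-tuple used to identify a TCP flow in this sketch. */
struct flow_key {
	unsigned int saddr, daddr;
	unsigned short sport, dport;
};

enum gro_mode { GRO_REGULAR, GRO_FRAGLIST };

/* Stand-in for the socket lookup: scan a table of locally owned flows. */
static bool local_socket_exists(const struct flow_key *table, size_t n,
				const struct flow_key *key)
{
	for (size_t i = 0; i < n; i++) {
		if (table[i].saddr == key->saddr &&
		    table[i].daddr == key->daddr &&
		    table[i].sport == key->sport &&
		    table[i].dport == key->dport)
			return true;
	}
	return false;
}

/*
 * Pick the aggregation mode for an incoming TCP packet:
 *  - feature off: regular GRO, and no lookup is performed at all
 *  - feature on, local socket found: regular GRO for local delivery
 *  - feature on, no local socket: fraglist GRO, since the packet is
 *    presumably going to be forwarded
 */
static enum gro_mode pick_gro_mode(bool fraglist_enabled,
				   const struct flow_key *table, size_t n,
				   const struct flow_key *key)
{
	if (!fraglist_enabled)
		return GRO_REGULAR;
	if (local_socket_exists(table, n, key))
		return GRO_REGULAR;
	return GRO_FRAGLIST;
}
```

The lookup only happens on the opt-in path, so configurations that leave
the feature disabled pay nothing extra in this model.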

- Felix
