Date: Tue, 23 Apr 2024 13:17:40 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Felix Fietkau <nbd@....name>
Cc: netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>, 
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, David Ahern <dsahern@...nel.org>, 
	linux-kernel@...r.kernel.org
Subject: Re: [RFC] net: add TCP fraglist GRO support

On Tue, Apr 23, 2024 at 12:25 PM Felix Fietkau <nbd@....name> wrote:
>
> On 23.04.24 12:15, Eric Dumazet wrote:
> > On Tue, Apr 23, 2024 at 11:41 AM Felix Fietkau <nbd@....name> wrote:
> >>
> >> When forwarding TCP after GRO, software segmentation is very expensive,
> >> especially when the checksum needs to be recalculated.
> >> One case where that's currently unavoidable is when routing packets over
> >> PPPoE. Performance improves significantly when using fraglist GRO
> >> implemented in the same way as for UDP.
> >>
> >> Here's a measurement of running 2 TCP streams through a MediaTek MT7622
> >> device (2-core Cortex-A53), which runs NAT with flow offload enabled from
> >> one ethernet port to PPPoE on another ethernet port + cake qdisc set to
> >> 1Gbps.
> >>
> >> rx-gro-list off: 630 Mbit/s, CPU 35% idle
> >> rx-gro-list on:  770 Mbit/s, CPU 40% idle
> >
> > Hi Felix
> >
> > changelog is a bit terse, and patch complex.
> >
> > Could you elaborate why this issue
> > seems to be related to a specific driver ?
> >
> > I think we should push hard to not use frag_list in drivers :/
> >
> > And GRO itself could avoid building frag_list skbs
> > in hosts where forwarding is enabled.
> >
> > (Note that we also can increase MAX_SKB_FRAGS to 45 these days)
>
> The issue is not related to a specific driver at all. Here's how traffic
> flows: TCP packets are received on the SoC ethernet driver, the network
> stack performs regular GRO. The packet gets forwarded by flow offloading
> until it reaches the PPPoE device. PPPoE does not support GSO packets,
> so the packets need to be segmented again.
> This is *very* expensive, since data needs to be copied and checksummed.

GSO segmentation does not copy the payload, unless the device has no
SG (scatter-gather) capability.

I guess something should be done about that, regardless of your GRO work,
since most ethernet devices support SG these days.

Some drivers use header split for RX, so forwarding to PPPoE
would require linearization anyway if SG is not properly handled.
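
To make the SG point concrete, here is a rough toy model of how many bytes
get copied when one GSO packet is split back into MSS-sized segments. It is
a sketch, not kernel code: the function name, its parameters and the
assumption that headers are replicated once per segment are all
illustrative, and checksum cost is ignored.

```python
def segment_cost(payload_len, mss, hdr_len, dev_has_sg):
    """Toy estimate for splitting one GSO packet into MSS-sized segments.

    Returns (number_of_segments, bytes_copied).
    Assumption: every segment gets its own copy of the headers; the payload
    is copied only when the output device cannot do scatter-gather.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    nsegs = -(-payload_len // mss)          # integer ceiling division
    hdr_bytes = nsegs * hdr_len             # headers are replicated per segment
    payload_bytes = 0 if dev_has_sg else payload_len
    return nsegs, hdr_bytes + payload_bytes


# Hypothetical numbers: 10 full segments, 66 bytes of Ethernet/IP/TCP headers.
with_sg = segment_cost(14480, 1448, 66, dev_has_sg=True)
without_sg = segment_cost(14480, 1448, 66, dev_has_sg=False)
print("with SG:   ", with_sg)
print("without SG:", without_sg)
```

In this model the payload copy dominates as soon as the device lacks SG,
which is why handling SG properly on the PPPoE path matters independently
of any GRO change.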

>
> So in my patch, I changed the code to build fraglist GRO instead of
> regular GRO packets, whenever there is no local socket to receive the
> packets. This makes segmenting very cheap, since the original skbs are
> preserved on the trip through the stack. The only cost is an extra
> socket lookup whenever NETIF_F_FRAGLIST_GRO is enabled.

A socket lookup in a multi-net-namespace world is not going to work generically,
but I get the idea now.
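
The decision described in the quoted paragraph can be sketched as a toy
model. This is not the patch itself: the function name, the namespace
labels and the socket table keyed by (namespace, flow) are hypothetical.
The second call shows one possible reading of the namespace concern, which
the thread does not spell out: a lookup done in the receiving device's
namespace can miss a socket that lives in another namespace.

```python
def choose_gro_mode(flow, netns, sockets, fraglist_gro_enabled):
    """Pick the GRO flavour for a flow, as seen from one network namespace.

    flow    -- (src_ip, src_port, dst_ip, dst_port) tuple
    netns   -- namespace in which the lookup is performed (hypothetical label)
    sockets -- set of (netns, flow) pairs that have a local TCP socket
    """
    if not fraglist_gro_enabled:
        return "regular"        # feature off: no extra socket lookup at all
    if (netns, flow) in sockets:
        return "regular"        # locally terminated: keep regular GRO
    return "fraglist"           # assumed forwarded: keep original skbs chained


flow = ("10.0.0.2", 40000, "192.0.2.1", 443)
sockets = {("ns-container", flow)}   # the socket lives in another namespace

print(choose_gro_mode(flow, "ns-container", sockets, True))
print(choose_gro_mode(flow, "ns-host", sockets, True))
```

In this sketch the lookup from "ns-host" picks fraglist GRO even though the
flow is eventually delivered to a socket in "ns-container".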
