Date: Fri, 5 Apr 2019 10:12:35 +0200
From: Rafał Miłecki <zajec5@...il.com>
To: Toshiaki Makita <makita.toshiaki@....ntt.co.jp>,
    Felix Fietkau <nbd@....name>
Cc: Toshiaki Makita <toshiaki.makita1@...il.com>, netdev@...r.kernel.org,
    "David S. Miller" <davem@...emloft.net>,
    Stefano Brivio <sbrivio@...hat.com>,
    Sabrina Dubroca <sd@...asysnail.net>,
    David Ahern <dsahern@...il.com>,
    Jo-Philipp Wich <jo@...n.io>,
    Koen Vandeputte <koen.vandeputte@...ntric.com>
Subject: Re: NAT performance regression caused by vlan GRO support

On 05.04.2019 09:58, Toshiaki Makita wrote:
> On 2019/04/05 16:14, Felix Fietkau wrote:
>> On 2019-04-05 09:11, Rafał Miłecki wrote:
>>> I guess its GRO + csum_partial() to be blamed for this performance drop.
>>>
>>> Maybe csum_partial() is very fast on your powerful machine and few extra calls
>>> don't make a difference? I can imagine it affecting much slower home router with
>>> ARM cores.
>> Most high performance Ethernet devices implement hardware checksum
>> offload, which completely gets rid of this overhead.
>> Unfortunately, the BCM53xx/47xx Ethernet MAC doesn't have this, which is
>> why you're getting such crappy performance.
>
> Hmm... now I disabled rx checksum and tried the test again, and indeed I
> see csum_partial from GRO path. But I also see csum_partial even without
> GRO from nf_conntrack_in -> tcp_packet -> __skb_checksum_complete.
> Probably Rafał disabled nf_conntrack_checksum sysctl knob?
>
> But anyway even with disabling rx csum offload my machine has better
> performance with GRO. I'm sure in some cases GRO should be disabled, but
> I guess it's difficult to determine whether we should disable GRO or not
> automatically when csum offload is not available.

A few test results:

1) ethtool -K eth0 gro off; echo 0 > /proc/sys/net/netfilter/nf_conntrack_checksum
[  6]  0.0-60.0 sec  6.57 GBytes   940 Mbits/sec

2) ethtool -K eth0 gro off; echo 1 > /proc/sys/net/netfilter/nf_conntrack_checksum
[  6]  0.0-60.0 sec  4.65 GBytes   666 Mbits/sec

3) ethtool -K eth0 gro on; echo 0 > /proc/sys/net/netfilter/nf_conntrack_checksum
[  6]  0.0-60.0 sec  4.02 GBytes   575 Mbits/sec

4) ethtool -K eth0 gro on; echo 1 > /proc/sys/net/netfilter/nf_conntrack_checksum
[  6]  0.0-60.0 sec  4.04 GBytes   579 Mbits/sec
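
To make the four results easier to compare, here is a minimal Python sketch that expresses each one as a percentage drop from the fastest case (GRO off, nf_conntrack_checksum 0). The throughput numbers are copied from the iperf lines above; the `drop_percent` helper and the dictionary layout are only illustrative and were not part of the test setup.

```python
# Throughput in Mbits/sec, copied from the four iperf results above.
# Keys are (GRO setting, nf_conntrack_checksum value).
results = {
    ("gro off", 0): 940,
    ("gro off", 1): 666,
    ("gro on", 0): 575,
    ("gro on", 1): 579,
}

# Hypothetical helper: relative drop of `value` from `baseline`, in percent.
def drop_percent(baseline, value):
    return (baseline - value) / baseline * 100.0

# Use the fastest configuration (GRO off, conntrack checksum off) as baseline.
baseline = results[("gro off", 0)]

for (gro, csum), mbits in results.items():
    print(f"{gro}, nf_conntrack_checksum={csum}: "
          f"{mbits} Mbits/sec ({drop_percent(baseline, mbits):.1f}% below baseline)")
```

With these numbers, enabling only the conntrack checksum costs roughly 29%, while enabling GRO costs roughly 38% whether or not the conntrack checksum is on.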