Date:	Mon, 25 May 2015 16:53:43 -0700
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	"John A. Sullivan III" <jsullivan@...nsourcedevel.com>
Cc:	netdev@...r.kernel.org
Subject: Re: TCP window auto-tuning sub-optimal in GRE tunnel

On Mon, 2015-05-25 at 19:35 -0400, John A. Sullivan III wrote:
> On Mon, 2015-05-25 at 16:19 -0700, Eric Dumazet wrote:
> > On Mon, 2015-05-25 at 18:44 -0400, John A. Sullivan III wrote:
> > > On Mon, 2015-05-25 at 15:38 -0700, Eric Dumazet wrote:
> > > > On Mon, 2015-05-25 at 18:22 -0400, John A. Sullivan III wrote:
> > > > 
> > > > > 2) Why do we still not negotiate the 16MB buffer that we get when we are
> > > > > not using GRE?
> > > > 
> > > > What exact NIC handles the receive side?
> > > > 
> > > > If the driver allocates a full 4KB page to hold each frame,
> > > > plus sk_buff overhead, then 32MB of kernel memory translates
> > > > to only about 8MB of TCP window space.
> > > > 
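A back-of-the-envelope sketch of that accounting (a minimal model; the
payload and overhead constants below are illustrative assumptions, not
values measured in this thread):

    # Rough model: if each received frame costs a full 4 KB page plus
    # sk_buff overhead, only a fraction of the socket buffer can be
    # advertised as TCP window.
    rcvbuf = 32 * 1024 * 1024   # kernel memory charged to the socket
    payload = 1448              # TCP payload per MTU-sized frame (assumed)
    truesize = 4096 + 256       # one page per frame + sk_buff (assumed)

    frames = rcvbuf // truesize
    window = frames * payload
    print("%d MB of buffer -> ~%d MB of usable window"
          % (rcvbuf >> 20, window >> 20))

With these numbers a 32MB buffer yields roughly a 10MB window; larger
per-frame overhead pushes the ratio toward the 8MB figure above.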
> > > Hi, Eric.  I'm not sure I understand the question or how to obtain the
> > > information you've requested.  The receive-side system has 48GB of RAM,
> > > but that does not sound like what you are requesting.
> > > 
> > > I suspect the behavior is a "protection mechanism", i.e., it is being
> > > calculated for good reason.  When I set the buffer to 16MB manually in
> > > nuttcp, performance degraded, so I assume I was overrunning something.  I
> > > am still downloading the traces.
> > > 
> > > But I'm still mystified by why this only affects GRE traffic.  Thanks -
> > 
> > GRE is quite expensive; it needs some extra CPU load.
> > 
> > On the receiver, can you please check which exact driver is loaded?
> > 
> > Is it igb, ixgbe, e1000e, or i40e?
> > 
> > ethtool -i eth0
> > 
> > GRE adds an extra 28 bytes of encapsulation, which can definitely make
> > the skb a little bit fat. TCP has very simple heuristics (using
> > power-of-two steps), so a 50% factor can be explained by these extra
> > 28 bytes for some particular driver.
> > 
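One plausible mechanism for that 50% step, sketched under assumed
constants: skb data allocations are commonly rounded up to power-of-two
kmalloc sizes, so 28 extra bytes can push a frame from a 2KB slab into
a 4KB one, doubling its truesize (the 512-byte overhead below is
illustrative, not a measured value):

    def kmalloc_size(n):
        # Simplified power-of-two rounding, mimicking kmalloc slab sizes.
        size = 32
        while size < n:
            size *= 2
        return size

    overhead = 512  # assumed headroom + shared-info bytes (illustrative)
    for name, frame in (("plain", 1514), ("GRE", 1514 + 28)):
        print("%-5s frame %4d bytes -> %d byte slab"
              % (name, frame, kmalloc_size(frame + overhead)))
    # plain frame 1514 bytes -> 2048 byte slab
    # GRE   frame 1542 bytes -> 4096 byte slab

Double the truesize per packet and the same rcvbuf can advertise only
about half the window.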
> > You could emulate this at the sender (without GRE) by reducing the MTU
> > on the route to your target:
> > 
> > ip route add 192.x.y.z via <gateway> mtu 1450
> > 
> > 
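(To confirm that a per-route MTU like this took effect, standard
iproute2 usage is:

    ip route get 192.x.y.z

which prints the route attributes, including the mtu, that the kernel
selected for that destination.)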
> 
> Both the receiver and the gateway are using igb:
> root@...rveringestst-01:~# ethtool -i eth0
> driver: igb
> version: 3.2.10-k
> firmware-version: 1.4-3
> bus-info: 0000:01:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> 
> Changing the MTU does not show the same degradation as GRE:

Then it is very possible that igb was not able to dissect GRE packets,
and the driver's skb allocation entered a 'slow path'.

You might try a more recent Linux kernel version on the receiver.

The current igb version is 5.2.15-k.
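As a quick check (standard tools; depending on the kernel, modinfo may
or may not report an explicit version field for the in-tree driver):

    uname -r
    modinfo igb | grep -i version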

