lists.openwall.net
Open Source and information security mailing list archives
Message-ID: <ffef54f4-4e4c-9125-6e3a-fc566d2869ec@cumulusnetworks.com>
Date: Mon, 8 Aug 2016 11:48:55 -0600
From: David Ahern <dsa@...ulusnetworks.com>
To: Lennert Buytenhek <buytenh@...tstofly.org>,
    Roopa Prabhu <roopa@...ulusnetworks.com>,
    Robert Shearman <rshearma@...cade.com>
Cc: Alexander Duyck <aduyck@...antis.com>, netdev@...r.kernel.org
Subject: Re: problem with MPLS and TSO/GSO

On 7/25/16 10:39 AM, Lennert Buytenhek wrote:
> Hi!
>
> I am seeing pretty horrible TCP transmit performance (anywhere between
> 1 and 10 Mb/s, on a 10 Gb/s interface) when traffic is sent out over a
> route that involves MPLS labeling, and this seems to be due to an
> interaction between MPLS and TSO/GSO that causes all segmentable TCP
> frames that are MPLS-labeled to be dropped on egress.
>
> I initially ran into this issue with the ixgbe driver, but it is easily
> reproduced with veth interfaces, and the script attached below this
> email reproduces the issue.  The script configures three network
> namespaces: one that transmits TCP data (netperf) with MPLS labels,
> one that pops the MPLS labels and forwards the traffic on, and one
> that receives the traffic (netserver).  When not using MPLS labeling,
> I get ~30000 Mb/s single-stream TCP performance in this setup on my
> test box, and with MPLS labeling, I get ~2 Mb/s.
>
> Some investigating shows that egress TCP frames that need to be
> segmented are being dropped in validate_xmit_skb(), which calls
> skb_gso_segment(), which calls skb_mac_gso_segment(), which returns
> -EPROTONOSUPPORT because we apparently didn't have the right kernel
> module (mpls_gso) loaded.
>
> (It's somewhat poor design, IMHO, to degrade network performance by
> 15000x if someone didn't load a kernel module they didn't know they
> should have loaded, and in a way that doesn't log any warnings or
> errors and can only be diagnosed by adding printk calls to net/core/
> and recompiling your kernel.)
> (Also, I'm not sure why mpls_gso is needed when ixgbe seems to be
> able to natively do TSO on MPLS-labeled traffic, maybe because ixgbe
> doesn't advertise the necessary features in ->mpls_features?  But
> adding those bits doesn't seem to change much.)
>
> But loading mpls_gso doesn't change much -- skb_gso_segment() then
> starts returning -EINVAL instead, which is due to the
> skb_network_protocol() call in skb_mac_gso_segment() returning zero.
> And looking at skb_network_protocol(), I don't see how this is
> supposed to work -- skb->protocol is 0 at this point, and there is no
> way to figure out that what we are encapsulating is IP traffic,
> because unlike what is the case with VLAN tags, MPLS labels aren't
> followed by an inner ethertype that says what kind of traffic is
> inside; you have to have explicit knowledge of the payload type for
> MPLS.
>
> Any ideas?

Something is up with the skb manipulations or settings by mpls. With
the inner protocol set in mpls_output():

    skb_set_inner_protocol(skb, skb->protocol);

I get EINVAL failures from inet_gso_segment() because the iphdr is not
proper (ihl is 0 and version is 0).

Thanks for the script to repro with namespaces; much simpler to debug.