Date:	Tue, 3 Dec 2013 16:44:35 -0800 (PST)
From:	Joseph Gasparakis <joseph.gasparakis@...el.com>
To:	Joseph Gasparakis <joseph.gasparakis@...el.com>
cc:	Or Gerlitz <or.gerlitz@...il.com>,
	Eric Dumazet <eric.dumazet@...il.com>,
	Jerry Chu <hkchu@...gle.com>,
	Or Gerlitz <ogerlitz@...lanox.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Alexei Starovoitov <ast@...mgrid.com>,
	Pravin B Shelar <pshelar@...ira.com>,
	David Miller <davem@...emloft.net>,
	netdev <netdev@...r.kernel.org>
Subject: Re: vxlan/veth performance issues on net.git + latest kernels



On Tue, 3 Dec 2013, Joseph Gasparakis wrote:

> 
> 
> On Tue, 3 Dec 2013, Or Gerlitz wrote:
> 
> > On Wed, Dec 4, 2013 at 1:13 AM, Joseph Gasparakis
> > <joseph.gasparakis@...el.com> wrote:
> > >
> > >
> > > On Tue, 3 Dec 2013, Or Gerlitz wrote:
> > >
> > >> On Tue, Dec 3, 2013 at 11:11 PM, Joseph Gasparakis
> > >> <joseph.gasparakis@...el.com> wrote:
> > >>
> > >> >>> lack of GRO : receiver seems to not be able to receive as fast as you want.
> > >> >>>>      TCPOFOQueue: 3167879
> > >> >>> So many packets are received out of order (because of losses)
> > >>
> > >> >> I see that there's also no GRO for the non-veth tests which involve
> > >> >> vxlan, and over there the receiving side is able to consume the
> > >> >> packets. Do you have a rough explanation why adding veth to the chain
> > >> >> is such a game changer that makes things start falling apart?
> > >>
> > >> > I have seen this before. Here are my findings:
> > >> >
> > >> > The gso_type is different if the skb comes from veth or not. From veth,
> > >> > you will see SKB_GSO_DODGY set. This breaks things: when the skb
> > >> > with DODGY set moves from vxlan to the driver through
> > >> > dev_hard_start_xmit, the stack drops it silently. I never got the time
> > >> > to find the root cause for this, but I know it causes re-transmissions
> > >> > and big performance degradation.
> > >> >
> > >> > I went as far as quickly hacking a one-liner unsetting the DODGY bit
> > >> > in vxlan.c, and that bypassed the issue and recovered the performance,
> > >> > but obviously this is not a real fix.
> > >>
> > >> thanks for the heads up, a few quick questions/clarifications --
> > >>
> > >> -- you are talking about drops done @ the sender side, correct? Eric was
> > >> saying we have evidence that the drops happen on the receiver.
> > >
> > > I am *guessing* drops on the Rx are due to the drops at the Tx. See my
> > > answer to your next question for more info.
> > >
> > >>
> > >> -- without the hack you did, packets are still sent/received, so what
> > >> makes the stack drop only some of them?
> > >>
> > >
> > > What I had seen is GSOs getting dropped on the Tx side. Basically the GSOs
> > > never made it to the driver; they were broken into smaller non-GSO skbs by
> > > the stack. I think the stack is not handling the GSO with the DODGY bit
> > > set well, and that maybe causes the packet to be only partially emitted,
> > > causing the re-transmits (and maybe the drops on your Rx end)? Of course
> > > all this is speculation; what I do know is that as soon as I was
> > > forcing the gso type I saw offloaded VXLAN encapsulated traffic at decent speeds.
> > >
> > >> -- why packets coming from veth would have the SKB_GSO_DODGY bit set?
> > >
> > > That is something I would love to know too. I am guessing this is a way
> > > for the VM to say it is a non-trusted packet? And maybe all this can be
> > > fixed by setting something on the VM through a userspace tool that
> > > will stop the veth from setting the DODGY bit?
> > >
> > >>
> > >> -- so where (say in net.git or 3.12.x) is this one line you commented
> > >> out now? I don't see any explicit setting of SKB_GSO_DODGY in vxlan.c
> > >> or in ip_tunnel_core.c / ip_tunnel.c
> > >
> > > I did not commit it, as this was just a workaround to prove to myself that
> > > the problem I was seeing was due to the gso_type, and it would actually
> > > just hide the problem and not give a proper solution to it.
> > >
> > >>
> > >> Also, I am pretty sure the problem also exists when sending/receiving
> > >> guest traffic through tap/macvtap <--> vhost/virtio-net and friends, I
> > >> just stuck to the veth flavour b/c it's one (== the hypervisor)
> > >> network stack to debug and not two (+ the guest one).
> > 
> > understood, can you point to the line/area you hacked? I'd like to try it
> > too and see the impact
> 
> I was printing the gso_type in vxlan_xmit_skb(), right before 
> iptunnel_xmit() gets called (I was focusing on UDPv4 encap only). Then I saw the 
> gso_type was different when a VM was involved and when it was not 
> (although I was transmitting exactly the same packet), and then I replaced 
> my printk with something like skb_shinfo(skb)->gso_type = <the gso type I had
> for non-VM skb> and it all worked.
> 
> Then I looked into what was different between the two gso_types and the 
> only difference was that SKB_GSO_DODGY was set when Tx'ing from the VM.
> I am sure I could have been more delicate with the approach, but hey, it
> worked for me.
> 
> I would be curious to see if this is the same issue as mine. It seems like 
> it is.
>

Oh, and if I remember correctly, gso_type without VMs involved was 129 
(SKB_GSO_UDP_TUNNEL | SKB_GSO_TCPV4) and with VM it was 133 
(SKB_GSO_UDP_TUNNEL | SKB_GSO_DODGY | SKB_GSO_TCPV4).
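
For anyone who wants to check the arithmetic, here is a small userspace
sketch (not kernel code) that decodes the two values. The GSO_* names and
bit positions are local stand-ins that I picked only so that they match
129 and 133 as quoted above; the real SKB_GSO_* bit positions differ
between kernel versions, so please verify them against
include/linux/skbuff.h in your tree. gso_describe() and gso_clear_dodgy()
are hypothetical helpers, they do not exist in the kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed bit positions, chosen to match the values quoted above
 * (129 and 133). They are NOT taken from any particular kernel tree. */
enum {
	GSO_TCPV4      = 1 << 0,	/* 1   */
	GSO_DODGY      = 1 << 2,	/* 4   */
	GSO_UDP_TUNNEL = 1 << 7,	/* 128 */
};

/* Write the names of the known flags set in gso_type into buf,
 * each followed by a single space. Unknown bits are ignored. */
static void gso_describe(unsigned int gso_type, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	if (gso_type & GSO_UDP_TUNNEL)
		strncat(buf, "UDP_TUNNEL ", len - strlen(buf) - 1);
	if (gso_type & GSO_DODGY)
		strncat(buf, "DODGY ", len - strlen(buf) - 1);
	if (gso_type & GSO_TCPV4)
		strncat(buf, "TCPV4 ", len - strlen(buf) - 1);
}

/* What the workaround amounts to: clear only the DODGY bit and
 * leave every other flag untouched. */
static unsigned int gso_clear_dodgy(unsigned int gso_type)
{
	return gso_type & ~(unsigned int)GSO_DODGY;
}
```

With these assumed values, gso_describe(133, ...) names all three flags and
gso_clear_dodgy(133) gives back 129, i.e. the non-VM value. In the kernel
the same masking would be applied to skb_shinfo(skb)->gso_type, and as
said before it only hides the problem, it is not a fix.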

> > 
> > >> --
> > >> To unsubscribe from this list: send the line "unsubscribe netdev" in
> > >> the body of a message to majordomo@...r.kernel.org
> > >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > >>
> > 
> 
