Message-ID: <CAJZOPZL9LxB5HdqS=+gQW-MjvW2NsZOt3NPB7v1iOFoEB7SsMQ@mail.gmail.com>
Date: Tue, 3 Dec 2013 23:09:39 +0200
From: Or Gerlitz <or.gerlitz@...il.com>
To: Joseph Gasparakis <joseph.gasparakis@...el.com>
Cc: Eric Dumazet <eric.dumazet@...il.com>,
Jerry Chu <hkchu@...gle.com>,
Or Gerlitz <ogerlitz@...lanox.com>,
Eric Dumazet <edumazet@...gle.com>,
Alexei Starovoitov <ast@...mgrid.com>,
Pravin B Shelar <pshelar@...ira.com>,
David Miller <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>
Subject: Re: vxlan/veth performance issues on net.git + latest kernels
On Tue, Dec 3, 2013 at 11:11 PM, Joseph Gasparakis
<joseph.gasparakis@...el.com> wrote:
>>> lack of GRO : receiver seems to not be able to receive as fast as you want.
>>>> TCPOFOQueue: 3167879
>>> So many packets are received out of order (because of losses)
>> I see that there's no GRO also for the non-veth tests which involve
>> vxlan, and over there the receiving side is able to consume the
>> packets. Do you have a rough explanation of why adding veth to the chain
>> is such a game changer that makes things start falling apart?
> I have seen this before. Here are my findings:
>
> The gso_type is different if the skb comes from veth or not. From veth,
> you will see the SKB_GSO_DODGY set. This breaks things as when the
> skb with DODGY set moves from vxlan to the driver through dev_xmit_hard,
> the stack drops it silently. I never got the time to find the root cause
> for this, but I know it causes re-transmissions and big performance
> degradation.
>
> I went as far as just quickly hacking a one liner unsetting the DODGY bit
> in vxlan.c and that bypassed the issue and recovered the performance
> problem, but obviously this is not a real fix.
Thanks for the heads up. A few quick questions/clarifications:
-- you are talking about drops on the sender side, correct? Eric was
saying we have evidence that the drops happen on the receiver.
-- without your hack, packets are still sent/received, so what
makes the stack drop only some of them?
-- why would packets coming from veth have the SKB_GSO_DODGY bit set?
-- where is the one line you commented out now (say in net.git or
3.12.x)? I don't see any explicit setting of SKB_GSO_DODGY in vxlan.c
or in ip_tunnel_core.c / ip_tunnel.c.
Also, I am pretty sure the problem exists as well when sending/receiving
guest traffic through tap/macvtap <--> vhost/virtio-net and friends. I
just stuck to the veth flavour because it's one network stack (the
hypervisor's) to debug and not two (plus the guest's).
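
The one-liner hack described in the quoted text amounts to clearing a single flag in a GSO type bitmask. The following is a minimal user-space sketch of that bit manipulation in C, not kernel code. The names GSO_TCPV4, GSO_DODGY, gso_is_dodgy and gso_clear_dodgy are hypothetical stand-ins, and the bit values are illustrative assumptions; the real SKB_GSO_* flags are defined in include/linux/skbuff.h and may differ between kernel versions.

```c
#include <assert.h>

/*
 * Illustrative flag values only (assumption): stand-ins for the kernel's
 * SKB_GSO_* bits, which live in include/linux/skbuff.h.
 */
enum {
    GSO_TCPV4 = 1 << 0, /* stand-in for a TCPv4 segmentation flag */
    GSO_DODGY = 1 << 2  /* stand-in for the "untrusted source" flag */
};

/* Return non-zero when the DODGY stand-in bit is set in the mask. */
int gso_is_dodgy(unsigned int gso_type)
{
    return (gso_type & (unsigned int)GSO_DODGY) != 0;
}

/* Return the mask with the DODGY stand-in bit cleared; other bits are kept. */
unsigned int gso_clear_dodgy(unsigned int gso_type)
{
    return gso_type & ~(unsigned int)GSO_DODGY;
}

/* Small self-check: a mask like the one described for veth-originated skbs. */
void gso_selfcheck(void)
{
    unsigned int from_veth = GSO_TCPV4 | GSO_DODGY;

    assert(gso_is_dodgy(from_veth));
    assert(!gso_is_dodgy(gso_clear_dodgy(from_veth)));
    assert(gso_clear_dodgy(from_veth) == GSO_TCPV4);
}
```

Clearing the bit this way only masks the symptom. As the quoted text says, it is not a real fix, since the flag presumably exists to mark GSO metadata that should be validated before it is trusted.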