Message-ID: <50656A90.5030503@googlemail.com>
Date: Fri, 28 Sep 2012 10:14:56 +0100
From: Chris Clayton <chris2553@...glemail.com>
To: David Miller <davem@...emloft.net>
CC: eric.dumazet@...il.com, netdev@...r.kernel.org, gpiez@....de
Subject: Re: Possible networking regression in 3.6.0

On 09/28/12 07:53, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@...il.com>
> Date: Thu, 27 Sep 2012 23:17:04 +0200
>
>> Yes it seems the problem. On the host I tried :
>>
>> # ip ro get 8.8.8.8 from 192.168.200.1 iif tap1
>> 8.8.8.8 from 192.168.200.1 via 172.30.42.1 dev eth0
>> cache iif *
>>
>> So if the guest tries to send a frame to 8.8.8.8 we are going to forward
>> the packet to eth0
>>
>> But if the guest tries to send to 255.255.255.255, we try to deliver the
>> packet to the host itself, instead of broadcasting to eth0
>>
>> # ip ro get 255.255.255.255 from 192.168.200.1 iif tap1
>> broadcast 255.255.255.255 from 192.168.200.1 dev lo
>> cache <local,brd> iif *
>>
>> David, maybe you'll have an idea ?
>
> Perhaps this was introduced by:

Thanks, David.

Unfortunately, reversing that patch does not fix the problem. The pings
from the KVM guest to the router still time out.

I have bisected this (see
http://marc.info/?l=linux-netdev&m=134797809611847&w=2), which gave:

$ git bisect bad
d2d68ba9fe8b38eb03124b3176a013bb8aa2b5e5 is the first bad commit
commit d2d68ba9fe8b38eb03124b3176a013bb8aa2b5e5
Author: David S. Miller <davem@...emloft.net>
Date:   Tue Jul 17 12:58:50 2012 -0700

    ipv4: Cache input routes in fib_info nexthops.

    Caching input routes is slightly simpler than output routes, since we
    don't need to be concerned with nexthop exceptions. (locally
    destined, and routed packets, never trigger PMTU events or redirects
    that will be processed by us).

    However, we have to elide caching for the DIRECTSRC and non-zero itag
    cases.

    Signed-off-by: David S. Miller <davem@...emloft.net>

:040000 040000 6bbc75c1cbe62bf84ea412d3b98adf2b614779cd 3ad7256b4a71e63ca4530977c0550121ea803d35 M	include
:040000 040000 18c2a950a53c4eec9bfa12185d1e382dfed74af8 a2ab6157d6cd54930da395758c6ded3a225d1f04 M	net

Unfortunately, the related patches don't reverse cleanly, but a kernel
built from a git checkout of the parent commit
(f2bb4bedf35d5167a073dcdddf16543f351ef3ae) works fine.
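
In case it helps anyone compare a good and a bad kernel without a guest,
the two "ip ro get" results Eric quoted above can be told apart
mechanically. What follows is only a rough sketch in Python and is not
taken from any patch in this thread: the names parse_route_get and
broadcast_delivered_locally are made up for illustration, the two sample
strings are copied from the quoted output, and the parsing assumes the
output layout shown there.

```python
import re

# Sample outputs copied from the "ip ro get" results quoted in this thread.
FORWARDED = "8.8.8.8 from 192.168.200.1 via 172.30.42.1 dev eth0\n    cache  iif *"
BROADCAST = ("broadcast 255.255.255.255 from 192.168.200.1 dev lo\n"
             "    cache <local,brd> iif *")

# Route types that may appear as the first word of the output
# (illustrative subset; unlisted types fall back to "unicast").
ROUTE_TYPES = {"broadcast", "local", "multicast", "unreachable", "blackhole"}


def parse_route_get(output):
    """Hypothetical helper: pull type, dev, via and cache flags out of the text."""
    tokens = output.split()
    result = {"type": "unicast", "dev": None, "via": None, "flags": set()}
    if tokens and tokens[0] in ROUTE_TYPES:
        result["type"] = tokens[0]
    for key in ("dev", "via"):
        if key in tokens:
            idx = tokens.index(key)
            if idx + 1 < len(tokens):
                result[key] = tokens[idx + 1]
    # Cache flags are printed as a comma-separated list in angle brackets.
    match = re.search(r"<([^>]*)>", output)
    if match:
        result["flags"] = set(match.group(1).split(","))
    return result


def broadcast_delivered_locally(output):
    """True when a broadcast lookup resolves to local delivery on lo."""
    route = parse_route_get(output)
    return (route["type"] == "broadcast"
            and route["dev"] == "lo"
            and "local" in route["flags"])


if __name__ == "__main__":
    for sample in (FORWARDED, BROADCAST):
        print(parse_route_get(sample), broadcast_delivered_locally(sample))
```

Feeding it the output of the same "ip ro get ... iif tap1" lookups on both
kernels would show whether the broadcast case differs between them; it
only reads text and does not touch the routing tables.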
>
> commit 7bd86cc282a458b66c41e3f6676de6656c99b8db
> Author: Yan, Zheng <zheng.z.yan@...el.com>
> Date:   Sun Aug 12 20:09:59 2012 +0000
>
>     ipv4: Cache local output routes
>
>     Commit caacf05e5ad1abf causes big drop of UDP loop back performance.
>     The cause of the regression is that we do not cache the local output
>     routes. Each time we send a datagram from unconnected UDP socket,
>     the kernel allocates a dst_entry and adds it to the rt_uncached_list.
>     It creates lock contention on the rt_uncached_lock.
>
>     Reported-by: Alex Shi <alex.shi@...el.com>
>     Signed-off-by: Yan, Zheng <zheng.z.yan@...el.com>
>     Signed-off-by: David S. Miller <davem@...emloft.net>
>
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index e4ba974..fd9ecb5 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -2028,7 +2028,6 @@ struct rtable *__ip_route_output_key(struct net *net, struct flowi4 *fl4)
>  		}
>  		dev_out = net->loopback_dev;
>  		fl4->flowi4_oif = dev_out->ifindex;
> -		res.fi = NULL;
>  		flags |= RTCF_LOCAL;
>  		goto make_route;
>  	}
>