Message-ID: <1401885281.3645.245.camel@edumazet-glaptop2.roam.corp.google.com>
Date: Wed, 04 Jun 2014 05:34:41 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Suprasad Mutalik Desai <suprasad.desai@...il.com>
Cc: netdev@...r.kernel.org, davem@...emloft.ne
Subject: Re: Fwd: Linux stack performance drop (TCP and UDP) in 3.10 kernel in routed scenario

On Wed, 2014-06-04 at 14:34 +0530, Suprasad Mutalik Desai wrote:
> Hi,
>
>
> I am currently working on the 3.10.12 kernel, and the Linux stack
> performance (TCP and UDP) seems to have degraded drastically compared
> to the 2.6 kernel.
>
> Results :
>
> Linux 2.6.32
> ---------------------
> TCP traffic using iperf
> - Upstream : 140 Mbps
> - Downstream : 148 Mbps
>
> UDP traffic using iperf
> - Upstream : 200 Mbps
> - Downstream : 245 Mbps
>
> Linux 3.10.12
> --------------------
> TCP traffic using iperf
> - Upstream : 101 Mbps
> - Downstream : 106 Mbps
>
> UDP traffic using iperf
> - Upstream : 140 Mbps
> - Downstream : 170 Mbps
>
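As a quick check of the figures quoted above, the relative drop per test can be worked out directly. This is only arithmetic on the reported iperf numbers; the dictionary layout and the helper name percent_drop are illustrative choices, not part of the original report.

```python
# Throughput figures in Mbps, copied from the quoted iperf results:
# (Linux 2.6.32, Linux 3.10.12)
results = {
    "TCP upstream": (140, 101),
    "TCP downstream": (148, 106),
    "UDP upstream": (200, 140),
    "UDP downstream": (245, 170),
}


def percent_drop(old, new):
    """Relative throughput loss going from `old` to `new`, in percent."""
    return (old - new) / old * 100.0


for name, (old, new) in results.items():
    print(f"{name}: {percent_drop(old, new):.1f}% lower on 3.10.12")
```

The drop comes out at roughly 28% for TCP and 30 to 31% for UDP, in both directions.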
> Analysis:
> ---------------
> 1. According to the profiling data on Linux 3.10.12,
> - fib_table_lookup and ip_route_input_noref are called most of
> the time, which seems to cause the degradation in performance.
>
> 8.77 csum_partial 0x80009A20 1404
The main problem here is the lack of checksum offload. What kind of NIC is used?
> 4.53 ipt_do_table 0x80365C34 1352
> 3.45 eth_xmit 0x870D0C88 5460
> 3.41 fib_table_lookup 0x8035240C 856 <----------
> 3.38 __netif_receive_skb_core 0x802B5C00 2276
> 3.07 dma_device_write 0x80013BD4 752
> 2.94 nf_iterate 0x802EA380 256
> 2.69 ip_route_input_noref 0x8030CE14 2520 <--------------
> 2.24 ip_forward 0x8031108C 1040
> 2.04 tcp_packet 0x802F45BC 3956
> 1.93 nf_conntrack_in 0x802EEAF4 2284
>
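For context on the csum_partial entry at the top of the profile: it is the kernel routine that accumulates the ones'-complement sum used by the IP, TCP and UDP checksums, and it shows up when that work is done in software rather than by the NIC. The sketch below is a plain Python rendering of the RFC 1071 algorithm, not the kernel's optimized implementation; the function name internet_checksum is an illustrative choice.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones'-complement checksum of `data` (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte

    total = 0
    for i in range(0, len(data), 2):
        # Add each big-endian 16-bit word to the running sum.
        total += (data[i] << 8) | data[i + 1]

    # Fold the carries back into the low 16 bits (end-around carry).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    return ~total & 0xFFFF


# Example bytes from RFC 1071, section 3.
sample = bytes([0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7])
print(hex(internet_checksum(sample)))
```

Every byte of the covered data has to be read to produce the sum, which is why the cost scales with packet size when it is done on the CPU.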
> 2. Based on the above observation, I searched and found that the
> routing cache code was removed in Linux 3.6, so every packet has to
> go through ip_route_input_noref to find its destination.
>
> 3. Related to this, a patch from David Miller, "ipv4: Early TCP
> socket demux", caches the dst per socket, maintains tcp_hashinfo,
> and uses early_demux(skb) (TCP --> tcp_v4_early_demux, UDP --> NULL,
> i.e. not defined) to get the dst of the skb, thus avoiding a call to
> ip_route_input_noref for every packet.
> - But this still doesn't handle routing scenarios (LAN <--> WAN).
>
> 4. A patch for UDP early demux was added in Linux 3.13, and
> certain bug fixes went into Linux 3.14.
>
> 5. As we are based on 3.10, we have no UDP early_demux support. This
> means we would have to backport the UDP early demux patch to the
> 3.10 kernel.
Nope: this will be of no use on a router. It will even slow the router
down.
>
>
> Issue :
> -----------
>
> 1. The implementation of "Early TCP socket demux" doesn't address
> the routing scenario (LAN <---> WAN). This means TCP and UDP routing
> performance will be lower in the 3.10 kernel, and also in the 3.14
> kernel, as every packet has to go through a route lookup.
>
>
> Is there an alternative way to get back the Linux stack performance of
> the 2.6 or 3.4 kernels, which have the route cache?
>
> I guess the plain routing scenario was NOT thought through when the
> routing cache code was removed.
It is the opposite. The route cache was an easy target for DDoS attacks.
It was a nightmare.
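
The DDoS point can be illustrated with a toy model, under loud assumptions: the cache here is keyed only on (source, destination) and has no garbage collection or size limit, and the two-entry forwarding table, the interface names and all function names are hypothetical. It is not the pre-3.6 kernel code. It only shows that a per-flow cache grows with the number of distinct sources an attacker sends, while a prefix-based FIB lookup keeps no per-flow state.

```python
import ipaddress
import random

# Hypothetical forwarding table: prefix -> output interface.
FIB = {
    ipaddress.ip_network("0.0.0.0/0"): "wan0",
    ipaddress.ip_network("192.168.1.0/24"): "lan0",
}


def fib_lookup(dst):
    """Longest-prefix match; keeps no state per packet or per flow."""
    matches = [net for net in FIB if dst in net]
    return FIB[max(matches, key=lambda net: net.prefixlen)]


def cached_lookup(cache, src, dst):
    """Toy route cache: one entry per (src, dst) pair seen so far."""
    key = (src, dst)
    if key not in cache:
        cache[key] = fib_lookup(dst)
    return cache[key]


def flood(num_packets, seed=0):
    """Send packets with random (spoofed) sources; return the cache size."""
    rng = random.Random(seed)
    cache = {}
    dst = ipaddress.ip_address("192.168.1.10")
    for _ in range(num_packets):
        src = ipaddress.ip_address(rng.getrandbits(32))
        cached_lookup(cache, src, dst)
    return len(cache)


print("cache entries after flood:", flood(10_000))
print("FIB entries:", len(FIB))
```

In this model almost every spoofed packet misses the cache and adds a new entry, so the attacker pays nothing extra while the router pays for both the full lookup and the cache upkeep.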