Message-ID: <3244031.uQGDddGTLF@h2o.as.studentenwerk.mhn.de>
Date: Tue, 01 Oct 2013 18:39:32 +0200
From: Wolfgang Walter <linux@...m.de>
To: netdev@...r.kernel.org
Subject: Big performance loss from 3.4.63 to 3.10.13 when routing ipv4
Hello,
I tried to upgrade one of our routers to 3.10.13 from 3.4.63 and I see a
dramatic performance loss. I tried 3.11.2 and it is still there.
*** Symptoms:
All network traffic over the router becomes slow and sluggish. If one pings the
router there is packet loss. After about 2 minutes the traffic stalls
completely for about 1 minute. Then it works again as in the beginning, only
to stall again, and so on.
This happens even with rather moderate traffic. While the router is still
routing, CPU utilization is higher than with 3.4.63, but only moderately.
When it stalls, no network traffic seems possible (except to loopback). If one
tries to ping any target from the router (even one on an interface with no
traffic at all), one gets:
ping: sendmsg: No buffer space available
As the router has about 15G free memory this probably means that an internal
table is full.
CPU utilization is low during that period.
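The "No buffer space available" message is the text for the errno ENOBUFS. A minimal sketch, assuming a small UDP probe run on the router would fail the same way ping does during a stall (not verified here), of how the stall windows could be timestamped. The names send_probe and watch, and the target address, are hypothetical:

```python
import errno
import socket
import time


def send_probe(sock, addr, payload=b"x"):
    """Send one datagram; return None on success or the errno name on failure."""
    try:
        sock.sendto(payload, addr)
        return None
    except OSError as exc:
        # e.g. "ENOBUFS" for "No buffer space available"
        return errno.errorcode.get(exc.errno, str(exc.errno))


def watch(addr, interval=1.0, count=10):
    """Print one timestamped line per probe: 'ok' or the errno name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for _ in range(count):
            result = send_probe(sock, addr)
            print(time.strftime("%H:%M:%S"), result or "ok")
            time.sleep(interval)


# Hypothetical usage (192.0.2.1 is a documentation address, port 9 is discard):
# watch(("192.0.2.1", 9), interval=1.0, count=300)
```

Logged this way, the start and end of each stall can be compared against the roughly 2 minute / 1 minute cycle described above.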
I can trigger it easily when I copy about 50 big files via scp over 50
different ipsec tunnels:
* boot router
* wait until all ipsec tunnels are established
* start copying:
H <--1G--> Router <---1G--->.......<-- >=100MBit --> Xn <---100Mbit----> Rn
So there is an ipsec tunnel between Router and Xn for each n = 1 to 50. I copy
files from Rn to H. I start the copy from H, so the TCP connections get
established from H to Rn.
The same test works just fine with 3.4.63. All cores are used but none
reaches its limit. The router neither drops pings nor has problems pinging
other targets.
I tested 3.8.13. It seems not to have this issue if I increase
net.ipv4.inet_peer_threshold
(I tried 6566400; I didn't try smaller values besides the default one).
With the default value, 3.8.13 behaves badly.
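For reference, a minimal sketch of reading and raising that threshold through /proc/sys, assuming the usual mapping of the sysctl name net.ipv4.inet_peer_threshold to the path /proc/sys/net/ipv4/inet_peer_threshold. The helper names are hypothetical, and writing requires root:

```python
from pathlib import Path

SYSCTL_ROOT = Path("/proc/sys")


def sysctl_path(name, root=SYSCTL_ROOT):
    """Map a dotted sysctl name to its file under root."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root=SYSCTL_ROOT):
    """Return the integer value, or None if this kernel does not expose it."""
    try:
        return int(sysctl_path(name, root).read_text().strip())
    except FileNotFoundError:
        return None


def write_sysctl(name, value, root=SYSCTL_ROOT):
    """Write a new integer value (needs root on a real system)."""
    sysctl_path(name, root).write_text(f"{value}\n")


# Hypothetical usage, with the value tried in the test above:
# print(read_sysctl("net.ipv4.inet_peer_threshold"))
# write_sysctl("net.ipv4.inet_peer_threshold", 6566400)
```

Recording the value before and after each test run makes it easier to tell which kernel was tested with which setting.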
But 3.8.13 seems to have other issues. Basically, routing stalls later, but
for much longer (up to 6 minutes or so).
*** Environment:
It's an 8-core machine (with AES-NI). It establishes a lot of ipsec tunnels. It
uses stateful packet filtering (but no NAT). The network cards are Intel
cards (drivers: igb and ixgbe). No IPv6. No Ethernet flow control enabled (but
that doesn't matter). No traffic shaping (that is, no tc). igb/ixgbe interfaces:
nothing modified with ethtool except flow control (autoneg off tx off rx off).
Any idea?
Regards,
--
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts