Message-ID: <1321857123.17419.2.camel@edumazet-laptop>
Date: Mon, 21 Nov 2011 07:32:03 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: netdev@...r.kernel.org
Subject: Re: Problems with dropped packets on bonded interface for 3.x kernels
On Sunday, 20 November 2011 at 23:16 -0600, Albert Chin wrote:
> I'm running Ubuntu 11.10 on an Intel SR2625URLXR system with an Intel
> S5520UR motherboard and an internal Intel E1G44HT (I340-T4) Quad Port
> Server Adapter. I am seeing dropped packets on a bonded interface,
> comprised of two GigE ports on the Intel E1G44HT Quad Port Server
> Adapter. The following kernels exhibit this problem:
> 3.0.0-12-server, 3.0.0-13-server, 3.1.0-2-server, 3.2.0-rc2
> Installing Fedora 16 with a 3.1.1-1.fc16.x86_64 also showed dropped
> packets.
>
> I also tried RHEL6 with a 2.6.32-131.17.1.el6.x86_64 kernel and didn't
> see any dropped packets. Testing an older 2.6.32-28.55-generic Ubuntu
> kernel also didn't show any dropped packets.
>
> So, with 2.6, I don't see dropped packets, but everything including
> 3.0 and after show dropped packets.
>
> # ifconfig bond0
> bond0 Link encap:Ethernet HWaddr 00:1b:21:d3:f6:0a
> inet6 addr: fe80::21b:21ff:fed3:f60a/64 Scope:Link
> UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
> RX packets:225 errors:0 dropped:186 overruns:0 frame:0
> TX packets:231 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:0
> RX bytes:25450 (25.4 KB) TX bytes:28368 (28.3 KB)
>
> With lacp_rate=fast, I see higher packet loss than with
> lacp_rate=slow. I've tried bonding t
>
> This server has the following network controllers for the two internal
> NICs:
> # lspci -vv
> 01:00.0 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)
> 01:00.1 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)
>
> And it has the following network controllers for the four NICs on the
> I340-T4 PCI-E card:
> # lspci -vv
> 0a:00.0 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
> 0a:00.1 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
> 0a:00.2 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
> 0a:00.3 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
>
> I tried bonding the two 82575EB NICs rather than two NICs on the 82580
> but see the same dropped packet issue.
>
> I have replaced the cables, tested each port individually on the
> switch without bonding, and don't see any reason to expect hardware as
> the issue. The switch is a Summit Extreme 400-48t.
>
> I am using an 802.3ad configuration:
> # cat /proc/net/bonding/bond0
> Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
>
> Bonding Mode: IEEE 802.3ad Dynamic link aggregation
> Transmit Hash Policy: layer2 (0)
> MII Status: up
> MII Polling Interval (ms): 100
> Up Delay (ms): 200
> Down Delay (ms): 0
>
> 802.3ad info
> LACP rate: fast
> Aggregator selection policy (ad_select): stable
> Active Aggregator Info:
> Aggregator ID: 1
> Number of ports: 1
> Actor Key: 17
> Partner Key: 24
> Partner Mac Address: 00:04:96:18:54:d5
>
> Slave Interface: eth4
> MII Status: up
> Speed: 1000 Mbps
> Duplex: full
> Link Failure Count: 0
> Permanent HW addr: 00:1b:21:d3:f6:0a
> Aggregator ID: 1
> Slave queue ID: 0
>
> Slave Interface: eth5
> MII Status: up
> Speed: 1000 Mbps
> Duplex: full
> Link Failure Count: 0
> Permanent HW addr: 00:1b:21:d3:f6:0b
> Aggregator ID: 2
> Slave queue ID: 0
>
> Anyone have any ideas?
>
Old kernels were dropping some packets (unknown protocols...) without
counting them, so the patch quoted below was added in 2.6.37.

You could use tcpdump to identify what these dropped packets are :)
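
Once frames have been captured, a quick way to see which protocols are
involved is to tally them by EtherType. The following is only a minimal
sketch, not something from the kernel or from tcpdump: the helper names
(ethertype_name, tally) and the small EtherType table are assumptions made
for illustration, and it expects raw Ethernet frames as bytes, however you
obtained them.

```python
import struct
from collections import Counter

# A few well-known EtherType values (illustrative, far from exhaustive).
ETHERTYPES = {
    0x0800: "IPv4",
    0x0806: "ARP",
    0x86DD: "IPv6",
    0x8100: "802.1Q VLAN",
    0x8809: "Slow Protocols (e.g. LACP)",
}


def ethertype_name(frame: bytes) -> str:
    """Return a label for the EtherType of a raw Ethernet frame."""
    if len(frame) < 14:
        # Need 6 bytes dst MAC + 6 bytes src MAC + 2 bytes type/length.
        return "truncated"
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype < 0x0600:
        # Below 0x0600 the field is not an EtherType (values up to 1500
        # are an IEEE 802.3 length), so do not look it up in the table.
        return "802.3 length/LLC"
    return ETHERTYPES.get(ethertype, "unknown 0x%04x" % ethertype)


def tally(frames):
    """Count frames per EtherType label."""
    return Counter(ethertype_name(f) for f in frames)
```

Labels that come back as "unknown 0x...." are the first candidates to
compare against the rx_dropped counter.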
commit caf586e5f23cebb2a68cbaf288d59dbbf2d74052
Author: Eric Dumazet <eric.dumazet@...il.com>
Date:   Thu Sep 30 21:06:55 2010 +0000

    net: add a core netdev->rx_dropped counter

    In various situations, a device provides a packet to our stack and we
    drop it before it enters protocol stack :
    - softnet backlog full (accounted in /proc/net/softnet_stat)
    - bad vlan tag (not accounted)
    - unknown/unregistered protocol (not accounted)

    We can handle a per-device counter of such dropped frames at core level,
    and automatically adds it to the device provided stats (rx_dropped), so
    that standard tools can be used (ifconfig, ip link, cat /proc/net/dev)

    This is a generalization of commit 8990f468a (net: rx_dropped
    accounting), thus reverting it.

    Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
    Signed-off-by: David S. Miller <davem@...emloft.net>
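
Since the counter ends up in the per-device stats, it can be read from
/proc/net/dev like any other. Below is a rough sketch of doing that from a
script, assuming the usual layout of that file (two header lines, then one
"name: 16 counters" line per device, with the RX drop count in the fourth
column). The function names and the SAMPLE text are made up for
illustration; the sample simply reuses the bond0 numbers from the ifconfig
output quoted above.

```python
# Hand-built sample in /proc/net/dev layout (values taken from the
# ifconfig output for bond0; this is not a real capture of the file).
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
 bond0:   25450     225    0  186    0     0          0         0    28368     231    0    0    0     0       0          0
"""


def parse_net_dev(text: str) -> dict:
    """Map device name -> packet and drop counters from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or len(fields) < 16:
            continue  # not a device line in the assumed layout
        stats[name.strip()] = {
            "rx_packets": int(fields[1]),
            "rx_dropped": int(fields[3]),
            "tx_packets": int(fields[9]),
            "tx_dropped": int(fields[11]),
        }
    return stats


def read_net_dev(path: str = "/proc/net/dev") -> dict:
    """Read and parse the live counters (Linux only)."""
    with open(path) as fh:
        return parse_net_dev(fh.read())
```

Calling parse_net_dev(SAMPLE)["bond0"] gives the same RX packet and drop
counts that ifconfig reported; sampling read_net_dev() twice and diffing
rx_dropped shows how quickly drops accumulate.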