Message-ID: <20100210182404.GB6160@ti94.telemetry-investments.com>
Date: Wed, 10 Feb 2010 13:24:04 -0500
From: Kelvin Ku <kelvin@...emetry-investments.com>
To: e1000-devel@...ts.sourceforge.net
Cc: netdev@...r.kernel.org, users@...ts.fedoraproject.org
Subject: RX performance degradation with e1000e in Linux 2.6.31 / F12
After upgrading from Linux 2.6.30 (Fedora 11) to 2.6.31 (Fedora 12), I am
experiencing significant packet loss on an Intel 82574L NIC running the
e1000e driver. I was not seeing this with kernel 2.6.30. I note that 2.6.30
ships e1000e version 0.3.3.4-k4 whereas 2.6.31 ships version 1.0.2-k2.
I have tried setting IntMode to 0, 1, and 2 and InterruptThrottleRate to 0, 1,
3 (the default), 1000, 5000, 10000, and 100000. I've also tried booting with
the "noapic" kernel parameter.
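For completeness, this is a sketch of how the parameter combinations were
applied between runs (a modprobe.d config fragment; the file name and
single-port form are my illustration, not necessarily the exact setup):

```shell
# /etc/modprobe.d/e1000e.conf -- one parameter combination per run
# (values drawn from the list above; reload the driver after editing)
options e1000e IntMode=2 InterruptThrottleRate=3
```

After editing, reload with 'rmmod e1000e && modprobe e1000e' (or reboot).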
I am testing with ttcp, sending 100,000 1450-byte UDP packets at about 910 Mbit/s.
With InterruptThrottleRate at 1, 3, 5000, or 10000, I see the following
behaviour on the receiver side:
ttcp -u -4 -l 1450 -s -fm -r
ttcp-r: buflen=1450, nbuf=2048, align=16384/0, port=5001 udp
ttcp-r: socket
ttcp-r: 98486900 bytes in 1.22 real seconds = 617.22 Mbit/sec +++
ttcp-r: 67924 I/O calls, msec/call = 0.02, calls/sec = 55794.64
ttcp-r: 0.0user 0.0sys 0:01real 0% 0i+0d 0maxrss 0+0pf 4963+3csw
So in total (145000000 - 98486900)/1450 = 32078 out of 100000 packets were
dropped, or about 32%.
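The drop count follows directly from the byte counts (shell arithmetic;
the variable names are mine):

```shell
# Recompute the loss from the ttcp-r output above:
# (bytes sent - bytes received) / datagram size
sent_bytes=145000000     # 100000 packets x 1450 bytes
recv_bytes=98486900      # "98486900 bytes" from ttcp-r
dropped=$(( (sent_bytes - recv_bytes) / 1450 ))
echo "$dropped dropped"  # -> 32078 dropped, i.e. ~32%
```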
This is the per-counter difference in /proc/interrupts before and after the
test; lan0 is the interface being tested. Notice the significant number of
interrupts on the "sequence error" interrupt; I'm guessing that's 57:
55: 0 0 0 8603 PCI-MSI-edge
56: 0 0 0 25 PCI-MSI-edge
57: 4868 0 0 0 PCI-MSI-edge lan0
67: 0 0 2 0 PCI-MSI-edge
68: 0 0 0 0 PCI-MSI-edge
69: 0 0 0 0 PCI-MSI-edge lan1
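The interrupt deltas above were taken by hand from two /proc/interrupts
snapshots; a throwaway helper along these lines (the function and file names
are mine, not part of the original setup) does the same per-IRQ subtraction:

```shell
# Sum the per-CPU columns of two saved /proc/interrupts snapshots and
# print the per-IRQ delta (before-snapshot first, after-snapshot second).
irq_delta() {
    awk '
        /^ *[0-9]+:/ {
            irq = $1; total = 0
            # per-CPU counts run until the first non-numeric field
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++)
                total += $i
            if (NR == FNR) before[irq] = total        # first file: record
            else           print irq, total - before[irq]
        }
    ' "$1" "$2"
}
```

Usage: 'cp /proc/interrupts before; <run test>; cp /proc/interrupts after;
irq_delta before after'.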
This is the difference between the output from 'ethtool -S lan0' before and
after the test; only fields which changed are shown:
: rx_broadcast: 585730 - 581046 = 4684
: rx_bytes: 931068459 - 822452567 = 108615892
: rx_csum_offload_good: 650149 - 577551 = 72598
: rx_long_byte_count: 931068459 - 822452567 = 108615892
: rx_missed_errors: 31003 - 6 = 30997
: rx_packets: 655692 - 583072 = 72620
: rx_smbus: 5784 - 5763 = 21
: tx_broadcast: 972 - 969 = 3
: tx_bytes: 388453 - 385439 = 3014
: tx_packets: 3025 - 3012 = 13
Notice the large rx_missed_errors count, which indicates packets dropped
because the NIC's receive FIFO filled up, i.e. the host or PCI bus could not
drain it fast enough.
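The ethtool comparison above was also done by hand; for anyone reproducing
this, a small helper in the same spirit (function and snapshot names are mine)
prints only the counters that moved, in the "after - before = delta" form
used above:

```shell
# Compare two saved 'ethtool -S <if>' snapshots and print every
# counter whose value changed between them.
ethtool_stats_diff() {
    awk '
        NR == FNR { before[$1] = $2; next }   # first file: record values
        ($1 in before) && ($2 + 0) != (before[$1] + 0) {
            printf "%s %s - %s = %d\n", $1, $2, before[$1], $2 - before[$1]
        }
    ' "$1" "$2"
}
```

Usage: 'ethtool -S lan0 > before; <run test>; ethtool -S lan0 > after;
ethtool_stats_diff before after'.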
If I disable interrupt throttling or set the limit very high, e.g., 100000, the
same test generates about 65,000 data interrupts and 93,000 error interrupts
and rx_missed_errors increases by 34,000. This suggests to me that the NIC is
attempting to raise an interrupt for every packet received.
An Intel 82576 NIC in the same system, running on the igb driver, is performing
OK under 2.6.31 (0 to 0.1% packet loss). For comparison, the same UDP test
generates about 6000 interrupts on the 82576.
Output from dmesg, dmidecode, ethtool, lspci, 'netstat -s', and
/proc/interrupts is attached.
N.B. I tried removing the 82576 NIC from the system before testing as well; no
change.
- Kelvin
Download attachment "testhost.dmesg.gz" of type "application/x-gzip" (12360 bytes)
Download attachment "testhost.dmidecode.gz" of type "application/x-gzip" (4371 bytes)
Download attachment "testhost.ethtool.gz" of type "application/x-gzip" (1075 bytes)
Download attachment "testhost.lspci.gz" of type "application/x-gzip" (5162 bytes)
Download attachment "testhost.netstat.gz" of type "application/x-gzip" (679 bytes)
Download attachment "testhost.proc-interrupts.gz" of type "application/x-gzip" (660 bytes)