Message-ID: <20100210182404.GB6160@ti94.telemetry-investments.com>
Date:	Wed, 10 Feb 2010 13:24:04 -0500
From:	Kelvin Ku <kelvin@...emetry-investments.com>
To:	e1000-devel@...ts.sourceforge.net
Cc:	netdev@...r.kernel.org, users@...ts.fedoraproject.org
Subject: RX performance degradation with e1000e in Linux 2.6.31 / F12

After upgrading from Linux 2.6.30 (Fedora 11) to 2.6.31 (Fedora 12), I am
experiencing significant packet loss on an Intel 82574L NIC using the e1000e
driver. I did not see this loss with kernel 2.6.30. I notice that 2.6.30 ships
e1000e version 0.3.3.4-k4, whereas 2.6.31 ships version 1.0.2-k2.

I have tried setting IntMode to 0, 1, and 2 and InterruptThrottleRate to 0, 1,
3 (the default), 1000, 5000, 10000, and 100000. I've also tried booting with
the "noapic" kernel parameter.
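
A sweep like the one described above is easier to repeat if the driver reload
commands are generated from the value lists. The sketch below is illustrative
only: the helper name modprobe_commands is hypothetical, it assumes a single
port and that every combination is of interest (the message does not say which
combinations were actually tried), and it only prints the commands rather than
running them.

```python
from itertools import product

# Values taken from the message; IntMode and InterruptThrottleRate are
# e1000e module parameters.
INT_MODES = [0, 1, 2]
THROTTLE_RATES = [0, 1, 3, 1000, 5000, 10000, 100000]


def modprobe_commands(int_modes, rates, driver="e1000e"):
    """Build one unload/reload command line per parameter combination."""
    commands = []
    for mode, rate in product(int_modes, rates):
        commands.append(
            f"modprobe -r {driver} && "
            f"modprobe {driver} IntMode={mode} InterruptThrottleRate={rate}"
        )
    return commands


COMMANDS = modprobe_commands(INT_MODES, THROTTLE_RATES)

if __name__ == "__main__":
    # Dry run: print the commands so they can be reviewed before use.
    for command in COMMANDS:
        print(command)
```

Reloading the driver drops the link, so each command would need to be followed
by a pause before the ttcp run is repeated.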

I am testing with ttcp, sending 100,000 1450-byte UDP packets at about 910 Mbps.
With InterruptThrottleRate set to 1, 3, 5000, or 10000, I see the following
behaviour on the receiver side:

        ttcp -u -4 -l 1450 -s -fm -r
ttcp-r: buflen=1450, nbuf=2048, align=16384/0, port=5001  udp
ttcp-r: socket
ttcp-r: 98486900 bytes in 1.22 real seconds = 617.22 Mbit/sec +++
ttcp-r: 67924 I/O calls, msec/call = 0.02, calls/sec = 55794.64
ttcp-r: 0.0user 0.0sys 0:01real 0% 0i+0d 0maxrss 0+0pf 4963+3csw

So in total (145000000 - 98486900)/1450 = 32078 out of 100000 packets were
dropped, or about 32%.
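
The loss figure above can be reproduced from the ttcp-r summary line. This is a
minimal sketch; the constant and function names are illustrative, and it
assumes every received datagram carried exactly 1450 bytes.

```python
PACKET_SIZE = 1450            # bytes per datagram (ttcp -l 1450)
PACKETS_SENT = 100_000        # datagrams sent by the transmitter
BYTES_RECEIVED = 98_486_900   # byte count reported by ttcp-r


def dropped_packets(sent, received_bytes, size):
    """Return how many datagrams never reached the receiving application."""
    received = received_bytes // size
    return sent - received


LOST = dropped_packets(PACKETS_SENT, BYTES_RECEIVED, PACKET_SIZE)
LOSS_PCT = 100.0 * LOST / PACKETS_SENT

if __name__ == "__main__":
    print(f"{LOST} of {PACKETS_SENT} packets dropped ({LOSS_PCT:.1f}%)")
```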

This is the change in each /proc/interrupts counter between the start and the
end of the test. lan0 is the interface being tested. Notice the significant
number of interrupts on what I am guessing is the "sequence error" vector,
IRQ 57:

 55:          0          0          0       8603 PCI-MSI-edge
 56:          0          0          0         25 PCI-MSI-edge [name unreadable]
 57:       4868          0          0          0 PCI-MSI-edge lan0
 67:          0          0          2          0 PCI-MSI-edge [name unreadable]
 68:          0          0          0          0 PCI-MSI-edge
 69:          0          0          0          0 PCI-MSI-edge lan1
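
A before/after difference of this kind can be computed from two saved copies of
/proc/interrupts. The sketch below is one possible approach, not the method
used for the table above. The function names are hypothetical, and the two
sample snapshots are made-up absolute values chosen only so that the delta for
IRQ 57 matches the reported 4868.

```python
def parse_interrupts(text):
    """Map each IRQ label (e.g. '57') to its list of per-CPU counts."""
    counts = {}
    for line in text.splitlines():
        irq, sep, rest = line.partition(":")
        if not sep:
            continue  # the CPU header line has no colon
        per_cpu = []
        for token in rest.split():
            if not token.isdigit():
                break  # reached the chip and handler name columns
            per_cpu.append(int(token))
        counts[irq.strip()] = per_cpu
    return counts


def interrupt_delta(before_text, after_text):
    """Return per-CPU count increases for every IRQ in the second snapshot."""
    before = parse_interrupts(before_text)
    after = parse_interrupts(after_text)
    delta = {}
    for irq, now in after.items():
        earlier = before.get(irq, [0] * len(now))
        delta[irq] = [a - b for a, b in zip(now, earlier)]
    return delta


# Hypothetical snapshots; only the resulting delta for IRQ 57 is from the report.
BEFORE = """           CPU0       CPU1       CPU2       CPU3
 57:       1000          0          0          0 PCI-MSI-edge lan0
 69:         12          0          0          0 PCI-MSI-edge lan1
"""
AFTER = """           CPU0       CPU1       CPU2       CPU3
 57:       5868          0          0          0 PCI-MSI-edge lan0
 69:         12          0          0          0 PCI-MSI-edge lan1
"""

if __name__ == "__main__":
    for irq, change in interrupt_delta(BEFORE, AFTER).items():
        if any(change):
            print(irq, change)
```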

This is the difference between the output from 'ethtool -S lan0' before and
after the test; only fields which changed are shown:

rx_broadcast:          585730 - 581046    = 4684
rx_bytes:              931068459 - 822452567 = 108615892
rx_csum_offload_good:  650149 - 577551    = 72598
rx_long_byte_count:    931068459 - 822452567 = 108615892
rx_missed_errors:      31003 - 6          = 30997
rx_packets:            655692 - 583072    = 72620
rx_smbus:              5784 - 5763        = 21
tx_broadcast:          972 - 969          = 3
tx_bytes:              388453 - 385439    = 3014
tx_packets:            3025 - 3012        = 13
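
The same kind of listing can be produced from two saved 'ethtool -S' outputs.
This is a rough sketch with hypothetical function names. The sample text uses
the rx_packets and rx_missed_errors values from the listing above; the
rx_crc_errors line is an illustrative unchanged counter, not a value from the
attached output.

```python
def parse_ethtool_stats(text):
    """Parse 'name: value' lines from ethtool -S output into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def changed_stats(before_text, after_text):
    """Return {counter: increase} for counters whose value changed."""
    before = parse_ethtool_stats(before_text)
    after = parse_ethtool_stats(after_text)
    return {
        name: value - before.get(name, 0)
        for name, value in after.items()
        if value != before.get(name, 0)
    }


BEFORE = """NIC statistics:
     rx_packets: 583072
     rx_missed_errors: 6
     rx_crc_errors: 0
"""
AFTER = """NIC statistics:
     rx_packets: 655692
     rx_missed_errors: 31003
     rx_crc_errors: 0
"""

if __name__ == "__main__":
    for name, increase in changed_stats(BEFORE, AFTER).items():
        print(f"{name}: +{increase}")
```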

Notice the large rx_missed_errors count, which indicates that the NIC's receive
FIFO or the PCI bus is being exhausted.

If I disable interrupt throttling or set the limit very high, e.g., 100000, the
same test generates about 65,000 data interrupts and 93,000 error interrupts
and rx_missed_errors increases by 34,000. This suggests to me that the NIC is
attempting to raise an interrupt for every packet received.

An Intel 82576 NIC in the same system, running on the igb driver, is performing
OK under 2.6.31 (0 to 0.1% packet loss). For comparison, the same UDP test
generates about 6000 interrupts on the 82576.
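
The interrupt counts above can be turned into a rough packets-per-interrupt
ratio to compare the two NICs. This is a back-of-the-envelope sketch. It
assumes that the 82574L delivered roughly the packets sent minus the 34,000
counted in rx_missed_errors, and that the 82576 delivered essentially all
100,000; the names used are illustrative.

```python
PACKETS_SENT = 100_000


def packets_per_interrupt(packets, interrupts):
    """Average number of packets handled per interrupt."""
    return packets / interrupts


# 82574L (e1000e) with throttling disabled: about 65,000 data interrupts.
# Assumption: delivered packets = sent packets minus rx_missed_errors increase.
E1000E_DELIVERED = PACKETS_SENT - 34_000
E1000E_RATIO = packets_per_interrupt(E1000E_DELIVERED, 65_000)

# 82576 (igb): about 6000 interrupts for the same test, near-zero loss.
IGB_RATIO = packets_per_interrupt(PACKETS_SENT, 6_000)

if __name__ == "__main__":
    print(f"82574L (e1000e): {E1000E_RATIO:.2f} packets per data interrupt")
    print(f"82576  (igb):    {IGB_RATIO:.2f} packets per interrupt")
```

A ratio close to 1 for the 82574L is consistent with one interrupt per received
packet, whereas the 82576 batches on the order of 17 packets per interrupt.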

The dmesg, dmidecode, ethtool, lspci, 'netstat -s', and /proc/interrupts
outputs are attached.

N.B. I tried removing the 82576 NIC from the system before testing as well; no
change.

- Kelvin

Download attachment "testhost.dmesg.gz" of type "application/x-gzip" (12360 bytes)

Download attachment "testhost.dmidecode.gz" of type "application/x-gzip" (4371 bytes)

Download attachment "testhost.ethtool.gz" of type "application/x-gzip" (1075 bytes)

Download attachment "testhost.lspci.gz" of type "application/x-gzip" (5162 bytes)

Download attachment "testhost.netstat.gz" of type "application/x-gzip" (679 bytes)

Download attachment "testhost.proc-interrupts.gz" of type "application/x-gzip" (660 bytes)
