Message-ID: <36D9DB17C6DE9E40B059440DB8D95F52044F8BA3@orsmsx418.amr.corp.intel.com>
Date: Wed, 30 Jan 2008 21:43:01 -0800
From: "Brandeburg, Jesse" <jesse.brandeburg@...el.com>
To: "Bruce Allen" <ballen@...vity.phys.uwm.edu>
Cc: <netdev@...r.kernel.org>,
"Carsten Aulbert" <carsten.aulbert@....mpg.de>,
"Henning Fehrmann" <henning.fehrmann@....mpg.de>,
"Bruce Allen" <bruce.allen@....mpg.de>
Subject: RE: e1000 full-duplex TCP performance well below wire speed

Bruce Allen wrote:
> Hi Jesse,
>
> It's good to be talking directly to one of the e1000 developers and
> maintainers. Although at this point I am starting to think that the
> issue may be TCP stack related and nothing to do with the NIC. Am I
> correct that these are quite distinct parts of the kernel?
Yes, quite.
> Important note: we ARE able to get full duplex wire speed (over 900
> Mb/s simultaneously in both directions) using UDP. The problems occur
> only with TCP connections.
That probably rules out bus bandwidth as the limit, but small packets
still consume a lot of extra descriptors, bus bandwidth, CPU, and cache
resources.
>>> The test was done with various mtu sizes ranging from 1500 to 9000,
>>> with ethernet flow control switched on and off, and using reno and
>>> cubic as a TCP congestion control.
>>
>> As asked in LKML thread, please post the exact netperf command used
>> to start the client/server, whether or not you're using irqbalanced
>> (aka irqbalance) and what cat /proc/interrupts looks like (you ARE
>> using MSI, right?)
>
> I have to wait until Carsten or Henning wake up tomorrow (now 23:38 in
> Germany). So we'll provide this info in ~10 hours.
I would suggest you try TCP_RR with a command line something like this:
netperf -t TCP_RR -H <hostname> -C -c -- -b 4 -r 64K
I think you'll have to compile netperf with burst mode support enabled.
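Roughly, that would look like the following (just a sketch; if I
remember right the configure switch is --enable-burst, and <hostname>
is the machine running netserver):

  # build netperf with burst-mode (-b) support
  ./configure --enable-burst && make && make install

  # on the remote machine
  netserver

  # on the local machine: 64K request/response with 4 transactions
  # outstanding, reporting local (-c) and remote (-C) CPU utilization
  netperf -t TCP_RR -H <hostname> -C -c -- -b 4 -r 64K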
> I assume that the interrupt load is distributed among all four cores
> -- the default affinity is 0xff, and I also assume that there is some
> type of interrupt aggregation taking place in the driver. If the
> CPUs were not able to service the interrupts fast enough, I assume
> that we would also see loss of performance with UDP testing.
>
>> One other thing you can try with e1000 is disabling the dynamic
>> interrupt moderation by loading the driver with
>> InterruptThrottleRate=8000,8000,... (the number of commas depends on
>> your number of ports) which might help in your particular benchmark.
>
> OK. Is 'dynamic interrupt moderation' another name for 'interrupt
> aggregation'? Meaning that if more than one interrupt is generated
> in a given time interval, then they are replaced by a single
> interrupt?
Yes, InterruptThrottleRate=8000 means there will be no more than 8000
ints/second from that adapter, and if interrupts are generated faster
than that they are "aggregated."
Interestingly, since you are interested in ultra-low latency and may be
willing to give up some CPU for it during bulk transfers, you should try
InterruptThrottleRate=1 (which can generate up to 70000 ints/s).
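If it helps, something along these lines should do it (a sketch only,
assuming e1000 is built as a module and you have two ports; adjust the
comma-separated list to match your port count):

  # reload the driver with a fixed 8000 ints/s cap on each port
  rmmod e1000
  modprobe e1000 InterruptThrottleRate=8000,8000

  # or the low-latency setting mentioned above
  # modprobe e1000 InterruptThrottleRate=1,1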
>> just for completeness can you post the dump of ethtool -e eth0 and
>> lspci -vvv?
>
> Yup, we'll give that info also.
>
> Thanks again!
You're welcome, it's an interesting discussion. Hope we can come to a
good conclusion.
Jesse