netdev - RE: e1000 performance issue in 4 simultaneous links

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <1200068444.9349.20.camel@cafe>
Date:	Fri, 11 Jan 2008 14:20:44 -0200
From:	Breno Leitao <leitao@...ux.vnet.ibm.com>
To:	"Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
	rick.jones2@...com
Cc:	netdev@...r.kernel.org
Subject: RE: e1000 performance issue in 4 simultaneous links

On Thu, 2008-01-10 at 12:52 -0800, Brandeburg, Jesse wrote:
> Breno Leitao wrote:
> > When I run netperf in just one interface, I get 940.95 * 10^6 bits/sec
> > of transfer rate. If I run 4 netperf against 4 different interfaces, I
> > get around 720 * 10^6 bits/sec.
> 
> I hope this explanation makes sense, but what it comes down to is that
> combining hardware round robin balancing with NAPI is a BAD IDEA.  In
> general the behavior of hardware round robin balancing is bad and I'm
> sure it is causing all sorts of other performance issues that you may
> not even be aware of.
I've made another test removing the ppc IRQ Round Robin scheme, bonded
each interface (eth6, eth7, eth16 and eth17) to different CPUs (CPU1,
CPU2, CPU3 and CPU4) and I also get around around 720 * 10^6 bits/s in
average.

Take a look at the interrupt table this time: 

io-dolphins:~/leitao # cat /proc/interrupts  | grep eth[1]*[67]
277:         15    1362450         13         14         13         14         15         18   XICS      Level     eth6
278:         12         13    1348681         19         13         15         10         11   XICS      Level     eth7
323:         11         18         17    1348426         18         11         11         13   XICS      Level     eth16
324:         12         16         11         19    1402709         13         14         11   XICS      Level     eth17


I also tried to bound all the 4 interface IRQ to a single CPU (CPU0)
using the noirqdistrib boot paramenter, and the performance was a little
worse.

Rick, 
  The 2 interface test that I showed in my first email, was run in two
different NIC. Also, I am running netperf with the following command
"netperf -H <hostname> -T 0,8" while netserver is running without any
argument at all. Also, running vmstat in parallel shows that there is no
bottleneck in the CPU. Take a look: 

procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 6714732  16168 227440    0    0     8     2  203   21  0  1 98  0  0
 0  0      0 6715120  16176 227440    0    0     0    28 16234  505  0 16 83  0  1
 0  0      0 6715516  16176 227440    0    0     0     0 16251  518  0 16 83  0  1
 1  0      0 6715252  16176 227440    0    0     0     1 16316  497  0 15 84  0  1
 0  0      0 6716092  16176 227440    0    0     0     0 16300  520  0 16 83  0  1
 0  0      0 6716320  16180 227440    0    0     0     1 16354  486  0 15 84  0  1
 

Thanks!

-- 
Breno Leitao <leitao@...ux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html