Date:	Wed, 02 May 2012 10:16:23 -0700
From:	Rick Jones <rick.jones2@...com>
To:	Eric Dumazet <eric.dumazet@...il.com>
CC:	Alexander Duyck <alexander.h.duyck@...el.com>,
	Alexander Duyck <alexander.duyck@...il.com>,
	David Miller <davem@...emloft.net>,
	netdev <netdev@...r.kernel.org>,
	Neal Cardwell <ncardwell@...gle.com>,
	Tom Herbert <therbert@...gle.com>,
	Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
	Michael Chan <mchan@...adcom.com>,
	Matt Carlson <mcarlson@...adcom.com>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Ben Hutchings <bhutchings@...arflare.com>,
	Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>,
	Maciej Żenczykowski <maze@...gle.com>
Subject: Re: [PATCH 3/4 v2 net-next] net: make GRO aware of skb->head_frag

On 05/02/2012 01:24 AM, Eric Dumazet wrote:
> On Tue, 2012-05-01 at 12:45 -0700, Alexander Duyck wrote:
>
>> I have a hacked together ixgbe up and running now with the new build_skb
>> logic and RSC/LRO disabled.  It looks like it is giving me a 5%
>> performance boost for small packet routing, but I am using more CPU for
>> netperf TCP receive tests and I was wondering if you had seen anything
>> similar on the tg3 driver?
>
> Really hard to say, numbers are so small on Gb link :
>
> what do you use to make your numbers ?
>
> netperf -H 172.30.42.23 -t OMNI -C -c
> OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.30.42.23 (172.30.42.23) port 0 AF_INET
> Local       Local       Local  Elapsed Throughput Throughput  Local Local  Remote Remote Local   Remote  Service
> Send Socket Send Socket Send   Time               Units       CPU   CPU    CPU    CPU    Service Service Demand
> Size        Size        Size   (sec)                          Util  Util   Util   Util   Demand  Demand  Units
> Final       Final                                             %     Method %      Method
> 1700840     1700840     16384  10.01   931.60     10^6bits/s  4.50  S      1.32   S      1.582   2.783   usec/KB

With so little CPU consumed, I'm a bit surprised the throughput 
wasn't 940 Mbit/s.
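(That 940-odd figure being roughly the standard-MTU TCP ceiling on GbE: 
about 1448 payload bytes for every 1538 bytes on the wire, or ~94.1% of 
the 1000 Mbit/s line rate.)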

It might be a good idea to fix the local and remote socket buffer sizes 
for these sorts of A-B comparisons, to take the variability of 
autotuning out of the picture.
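For instance, something along these lines pins both ends' socket 
buffers (the 256K and 16K values here are just placeholders, not a 
recommendation for any particular NIC or link):

netperf -H 172.30.42.23 -t omni -c -C -- -s 256K -S 256K -m 16K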

And then, to see if the small differences are "real", one can light up 
the confidence intervals.  For example (using kernels unrelated to the 
patch discussion):

raj@...dy:~/netperf2_trunk/src$ ./netperf -H 192.168.1.3 -t omni -c -C 
-I 99,1 -i 30,3 -- -s 256K -S 256K -m 16K -O 
throughput,local_cpu_util,local_sd,remote_cpu_util,remote_sd,throughput_confid,local_cpu_confid,remote_cpu_confid,confidence_iteration
OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.3 () 
port 0 AF_INET : +/-0.500% @ 99% conf.  : interval : demo
Throughput Local Local   Remote Remote  Throughput Local      Remote     Confidence
           CPU   Service CPU    Service Confidence CPU        CPU        Iterations
           Util  Demand  Util   Demand  Width (%)  Confidence Confidence Run
           %             %                         Width (%)  Width (%)

941.36     8.70  3.030   45.36  7.895   0.006      18.836     0.209      30

In this instance, I asked to be 99% confident the throughput and CPU 
util were within +/- 0.5% of the "real" mean.  The confidence intervals 
were hit for throughput and remote CPU util, but not for local CPU util 
- netperf was running on my personal workstation, which also receives 
email etc.  Presumably a more isolated and idle system would have hit 
the confidence intervals.
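For reference, the confidence-related options in that command break 
down roughly as follows (the interval given to -I is the total width, 
hence the +/-0.500% in the test banner):

  -I 99,1    ask for 99% confidence that the mean is within a 1%
             (i.e. +/- 0.5%) interval
  -i 30,3    run at most 30 and at least 3 iterations trying to get there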

Another source of variation to consider eliminating when looking for 
small differences in CPU utilization is the multiqueue support in the 
NIC.  I'll often just terminate irqbalance and point all the NIC's IRQs 
at a single CPU (when doing single-stream tests).  Or, one can fully 
specify the four-tuple for the netperf data connection; a sketch of 
both follows.
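
A rough sketch of that sort of setup, assuming a hypothetical eth2 and 
made-up addresses/ports (IRQ numbers, interface names and the way 
irqbalance is started all vary from system to system):

# keep irqbalance from moving interrupts around mid-test
killall irqbalance

# steer all of the NIC's queue interrupts to CPU0 (affinity mask is hex)
for irq in $(awk -F: '/eth2/ {print $1}' /proc/interrupts); do
        echo 1 > /proc/irq/$irq/smp_affinity
done

# or, pin the data connection's four-tuple so RSS always hashes it to
# the same queue - the test-specific -L and -P set the local address
# and the local,remote port numbers of the data connection
netperf -H 192.168.1.3 -t omni -c -C -- -L 192.168.1.2 -P 12345,54321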

rick jones
of course there is also the whole question of the effect of HW threading 
on the meaningfulness of OS-determined utilization...