Date:	Tue, 26 Aug 2014 14:52:25 +0200
From:	Jesper Dangaard Brouer <brouer@...hat.com>
To:	Jesper Dangaard Brouer <brouer@...hat.com>
Cc:	David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
	therbert@...gle.com, jhs@...atatu.com, hannes@...essinduktion.org,
	edumazet@...gle.com, jeffrey.t.kirsher@...el.com,
	rusty@...tcorp.com.au, dborkman@...hat.com
Subject: Re: [PATCH 0/2] Get rid of ndo_xmit_flush


On Tue, 26 Aug 2014 12:13:47 +0200 Jesper Dangaard Brouer <brouer@...hat.com> wrote:

> On Tue, 26 Aug 2014 08:28:15 +0200 Jesper Dangaard Brouer <brouer@...hat.com> wrote:
> > On Mon, 25 Aug 2014 16:34:58 -0700 (PDT) David Miller <davem@...emloft.net> wrote:
> > 
> > > Given Jesper's performance numbers, it's not the way to go.
> > > 
> > > Instead, go with a signalling scheme via new boolean skb->xmit_more.
> > 
> > I'll do benchmarking based on this new API proposal today.
> 
> While establishing an accurate baseline for my measurements, I'm
> starting to see too much variation in my trafgen measurements,
> meaning that we unfortunately cannot use it to measure variations on
> the nanosec scale.

Thus, we need to find a better, more accurate measurement tool than
trafgen/af_packet.

I changed my PPS monitor "ifpps-oneliner" to calculate the nanosec
variation between the instant reading and the average.  For TX, it also
records the "max" and "min" variation values seen.

This should give us a better (instant) picture of how accurate the
measurement is.

ifpps -clod eth5 -t 1000 | \
 awk 'BEGIN{txsum=0; rxsum=0; n=0; txvar=0; txvar_min=0; txvar_max=0; rxvar=0;} \
 /[[:digit:]]/ {txsum+=$11;rxsum+=$3;n++; \
   txvar=0; if (txsum/n>10 && $11>0) { \
     txvar=((1/(txsum/n)*10^9)-(1/$11*10^9)); \
     if (n>10 && txvar < txvar_min) {txvar_min=txvar}; \
     if (n>10 && txvar > txvar_max) {txvar_max=txvar}; \
   }; \
   rxvar=0; if (rxsum/n>10 && $3>0 ) { rxvar=((1/(rxsum/n)*10^9)-(1/$3*10^9))}; \
   printf "instant rx:%u tx:%u pps n:%u average: rx:%d tx:%d pps (instant variation TX %.3f ns (min:%.3f max:%.3f) RX %.3f ns)\n", $3, $11, n, rxsum/n, txsum/n, txvar, txvar_min, txvar_max, rxvar; \
   if (txvar > 2) {printf "WARNING instant variation high\n" } }'
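
As a rough illustration of the arithmetic in the one-liner above: each
PPS reading is converted to a per-packet time in nanosec (10^9/pps), and
the variation is the average's per-packet time minus the instant
reading's.  The Python sketch below is not part of the tool; the
function name is made up for this example, and the inputs are the
trafgen numbers (rx-usecs 1) reported further down.

```python
def ns_variation(avg_pps, instant_pps):
    """Per-packet time of the average minus that of the instant reading, in ns.

    Mirrors the awk expression (1/(txsum/n)*10^9) - (1/$11*10^9).
    A positive value means the instant reading was faster than the average.
    """
    return 1e9 / avg_pps - 1e9 / instant_pps


# trafgen, rx-usecs 1: average tx 1564534 pps, instant tx 1566064 pps
print(f"{ns_variation(1564534, 1566064):.3f} ns")
```

The result lands close to the reported 0.624 ns; small differences are
possible because the printed average is truncated to an integer.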


Nanosec variation with trafgen:
-------------------------------

As can be seen, the min and max nanosec variation with trafgen is
higher than we would like:

Results: trafgen
 (sudo ethtool -C eth5 rx-usecs 1)
 instant rx:0 tx:1566064 pps n:152 average: rx:0 tx:1564534 pps
 (instant variation TX 0.624 ns (min:-6.336 max:1.766) RX 0.000 ns)

Results: trafgen
 (sudo ethtool -C eth5 rx-usecs 30)
 instant rx:0 tx:1576452 pps n:121 average: rx:0 tx:1575652 pps
 (instant variation TX 0.322 ns (min:-4.479 max:0.714) RX 0.000 ns)


Switching to pktgen
-------------------

I suspect a more accurate measurement tool will be "pktgen", because
we can cut out most of the things that can cause these variations
(like kmem_cache and cache-hot variations, and most sched variations).

The main problem with ixgbe is that, in this overload scenario, the
performance is limited by the TX ring size and cleanup intervals, as
described in:
 http://netoptimizer.blogspot.dk/2014/06/pktgen-for-network-overload-testing.html
 https://www.kernel.org/doc/Documentation/networking/pktgen.txt

The results below try to determine which ixgbe ethtool setting gives the
most stable PPS readings.  Notice the TX "min" and "max" nanosec
variations seen over the period.  Each run was sampled over approx 120 sec.

The best setting seems to be:
 sudo ethtool -C eth5 rx-usecs 30
 sudo ethtool -G eth5 tx 512  #(default size)

Pktgen tests are single CPU performance numbers, script based on:
 https://github.com/netoptimizer/network-testing/blob/master/pktgen/example01.sh
 with CLONE_SKB="100000" (and single flow, const port number 9/discard)

Setting:
 sudo ethtool -G eth5 tx 512 #(Default setting)
 sudo ethtool -C eth5 rx-usecs 1 #(Default setting)
Result pktgen:
 * instant rx:1 tx:3933892 pps n:120 average: rx:1 tx:3934182 pps
   (instant variation TX -0.019 ns (min:-0.047 max:0.016) RX 0.000 ns)

The variation is very small, but the performance is limited by the TX
ring buffer being full most of the time, because TX cleanup is too slow.

Setting: (increased TX ring size)
 sudo ethtool -G eth5 tx 1024
 sudo ethtool -C eth5 rx-usecs 1 #(default setting)
Result pktgen:
 * instant rx:1 tx:5745632 pps n:118 average: rx:1 tx:5748818 pps
   (instant variation TX -0.096 ns (min:-0.293 max:0.897) RX 0.000 ns)
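
To put the two results above on the nanosec scale, the average PPS can
be converted to a per-packet time.  This is only a worked conversion of
the reported averages (tx 512 vs. tx 1024, both with rx-usecs 1); the
helper name is hypothetical.

```python
def ns_per_packet(pps):
    """Average time spent per transmitted packet, in nanoseconds."""
    return 1e9 / pps


default_ring = ns_per_packet(3934182)  # tx 512,  rx-usecs 1
larger_ring = ns_per_packet(5748818)   # tx 1024, rx-usecs 1

print(f"tx 512 : {default_ring:.2f} ns/packet")
print(f"tx 1024: {larger_ring:.2f} ns/packet")
print(f"saved  : {default_ring - larger_ring:.2f} ns/packet")
```

Seen this way, the sub-nanosec min/max variations are small compared
with the roughly 80 ns per packet gained by enlarging the TX ring.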

Setting:
 sudo ethtool -G eth5 tx 512
 sudo ethtool -C eth5 rx-usecs 20
Result pktgen:
 * instant rx:1 tx:5765168 pps n:120 average: rx:0 tx:5782242 pps
   (instant variation TX -0.512 ns (min:-1.008 max:1.599) RX 0.000 ns)

Setting:
 sudo ethtool -G eth5 tx 512
 sudo ethtool -C eth5 rx-usecs 30
Result pktgen:
 * instant rx:1 tx:5920856 pps n:114 average: rx:1 tx:5918350 pps
   (instant variation TX 0.071 ns (min:-0.177 max:0.135) RX 0.000 ns)

Setting:
 sudo ethtool -G eth5 tx 512
 sudo ethtool -C eth5 rx-usecs 40
Result pktgen:
 * instant rx:1 tx:5958408 pps n:120 average: rx:0 tx:5947908 pps
   (instant variation TX 0.296 ns (min:-1.410 max:0.595) RX 0.000 ns)

Setting:
 sudo ethtool -G eth5 tx 512
 sudo ethtool -C eth5 rx-usecs 50
Result pktgen:
 * instant rx:1 tx:5966964 pps n:120 average: rx:1 tx:5967306 pps
   (instant variation TX -0.010 ns (min:-1.330 max:0.169) RX 0.000 ns)

Setting:
 sudo ethtool -C eth5 rx-usecs 30
 sudo ethtool -G eth5 tx 1024
Result pktgen:
 * instant rx:0 tx:5846252 pps n:120 average: rx:1 tx:5852464 pps
   (instant variation TX -0.182 ns (min:-0.467 max:2.249) RX 0.000 ns)
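
One way to compare the pktgen runs above is to look at the spread
(max minus min) of the TX variation for each setting.  The sketch below
does that with the numbers copied from this mail.  The 5 Mpps cutoff is
an assumption added here to leave out the default setting, which is
very stable but limited by the full TX ring; it is not a threshold used
in the measurements themselves.

```python
# (tx ring size, rx-usecs, average TX pps, min variation ns, max variation ns)
RESULTS = [
    (512, 1, 3934182, -0.047, 0.016),
    (1024, 1, 5748818, -0.293, 0.897),
    (512, 20, 5782242, -1.008, 1.599),
    (512, 30, 5918350, -0.177, 0.135),
    (512, 40, 5947908, -1.410, 0.595),
    (512, 50, 5967306, -1.330, 0.169),
    (1024, 30, 5852464, -0.467, 2.249),
]

MIN_PPS = 5_000_000  # assumed cutoff to skip the ring-limited default run


def spread(row):
    """Width of the observed TX variation window in ns (max - min)."""
    return row[4] - row[3]


def most_stable(results, min_pps=MIN_PPS):
    """Setting with the smallest variation spread among the faster runs."""
    candidates = [r for r in results if r[2] >= min_pps]
    return min(candidates, key=spread)


for row in RESULTS:
    print(f"tx {row[0]:4d} rx-usecs {row[1]:2d}: spread {spread(row):.3f} ns")

best = most_stable(RESULTS)
print(f"most stable: tx {best[0]} rx-usecs {best[1]}")
```

Under that assumption the pick is tx 512 with rx-usecs 30, which agrees
with the setting named as best earlier in this mail.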


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer