Message-ID: <20141127130057.5403429c@redhat.com>
Date: Thu, 27 Nov 2014 13:00:57 +0100
From: Jesper Dangaard Brouer <brouer@...hat.com>
To: Alexander Duyck <alexander.h.duyck@...hat.com>
Cc: netdev@...r.kernel.org, davem@...emloft.net,
jeffrey.t.kirsher@...el.com, eric.dumazet@...il.com,
ast@...mgrid.com, brouer@...hat.com
Subject: Re: [RFC PATCH 0/3] net: Alloc NAPI page frags from their own pool
On Wed, 26 Nov 2014 16:05:50 -0800
Alexander Duyck <alexander.h.duyck@...hat.com> wrote:
> This patch series implements a means of allocating page fragments without
> the need for the local_irq_save/restore in __netdev_alloc_frag. By doing
> this I am able to decrease packet processing time by 11ns per packet in my
> test environment.
This is really good work!
I've tested the patchset (details below), with two different packet
sizes, 64 bytes and 272 bytes, due to the "copy-break" point in the
driver. Notice these tests are single flow, resulting in a single CPU
getting activated on the receiver.

If I drop packets very early in the iptables "raw" table, I see an
improvement of between 10.51 ns and 13.22 ns for 64 bytes (for 272
bytes between 9.64 ns and 11.97 ns), which corresponds with Alex's
observations.

A little surprisingly, when doing full forwarding (IP-routing), I see
a much larger nanosecond improvement: for 64 bytes between 47.64 ns
and 58.15 ns (for 272 bytes between 29.08 ns and 30.14 ns). This
improvement is larger than I expected. One pitfall is that with full
forwarding we can only forward approx 1 Mpps (single CPU), so the
accuracy between test runs varies more.
Setup
-----
Generator: ixgbe, pktgen (3x CPUs), sending 10G wirespeed
- Single flow pktgen, resulting in single CPU activation on target
- pkt@64bytes: tx:14900856 pps (wirespeed)
- pkt@272bytes: tx: 4228696 pps (wirespeed)
Ethernet wirespeed:
* (1/((64+20)*8))*(10*10^9) = 14880952
* (1/((272+20)*8))*(10*10^9) = 4280822
Receiver CPU E5-2695 running state-c0@...GHz
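For reference, those wirespeed numbers are just the 10G line rate
divided by the on-wire cost of one frame; the extra 20 bytes are the
per-frame overhead (7B preamble + 1B start-of-frame delimiter + 12B
inter-frame gap). A minimal userspace C sketch reproducing them:

/* pps = link_rate_bps / ((frame_size + 20) * 8) */
#include <stdio.h>

static double wirespeed_pps(double frame_bytes)
{
	const double link_bps = 10.0e9;  /* 10 Gbit/s */

	return link_bps / ((frame_bytes + 20.0) * 8.0);
}

int main(void)
{
	printf("64 bytes : %.0f pps\n", wirespeed_pps(64));   /* ~14880952 */
	printf("272 bytes: %.0f pps\n", wirespeed_pps(272));  /* ~4280822  */
	return 0;
}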
baseline
--------
Baseline: Full forwarding (no-netfilter):
* pkt@64bytes: tx:977414 pps
* pkt@64bytes: tx:974404 pps
* test-variation@64bytes: 3010pps (1/977414*10^9)-(1/974404*10^9) = -3.16ns
* pkt@272bytes: tx:911657 pps
* pkt@272bytes: tx:906229 pps
* test-variation@272bytes: 5428pps -6.57ns
Baseline: Drop in iptables RAW:
* pkt@64bytes: rx:2801058 pps
* pkt@64bytes: rx:2785579 pps
* test-variation@64bytes: 15479pps -1.98 ns
* pkt@272bytes: rx:2559718 pps
* pkt@272bytes: rx:2544577 pps
* test-variation@272bytes diff: 15141pps -2.32ns
With patch: Alex's napi_alloc_skb
---------------------------------
Full forwarding (no-netfilter) (pkt@64bytes):
* pkt@64bytes: tx:1025150 pps
* pkt@64bytes: tx:1032930 pps
* test-variation@64bytes: -7780pps 7.34ns
* Patchset improvements@64-fwd:
- 977414 -> 1025150 = 47736pps -> 47.64ns
- 974404 -> 1032930 = 58526pps -> 58.15ns
* pkt@272bytes: tx:937416 pps
* pkt@272bytes: tx:930761 pps
* test-variation@272bytes: 6655pps -7.62ns
* Patchset improvements@272-fwd:
- 911657 -> 937416 = 25759pps -> 30.14ns
- 906229 -> 930761 = 24532pps -> 29.08ns
Drop in iptables RAW (pkt@64bytes):
* pkt@64bytes: rx:2885820 pps
* pkt@64bytes: rx:2892050 pps
* test-variation@64bytes diff: 6230pps 0.746ns
* Patchset improvements@64-drop:
- 2800896 -> 2885820 = 84924pps -> 10.51 ns
- 2785579 -> 2892050 = 106471pps -> 13.22 ns
* pkt@272bytes: rx:2624484 pps
* pkt@272bytes: rx:2624492 pps
* test-variation: pkt@272bytes diff: 8pps 0ns
* Patchset improvements@272-drop:
- 2624484 -> 2559718 = 64766 pps -> 9.64 ns
- 2624492 -> 2544577 = 79915 pps -> 11.97 ns
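For reference, the "test-variation" and "Patchset improvements"
nanosecond figures above are simply the difference in per-packet time
implied by two pps rates, i.e. (1/pps_a - 1/pps_b)*10^9, as in the
formula shown under the baseline numbers. A small C sketch
reproducing two of them:

/* Per-packet time difference (ns) between two measured rates. */
#include <stdio.h>

static double delta_ns(double pps_a, double pps_b)
{
	return (1.0e9 / pps_a) - (1.0e9 / pps_b);
}

int main(void)
{
	/* baseline test variation, 64 bytes full forwarding: ~ -3.16 ns */
	printf("variation  : %.2f ns\n", delta_ns(977414, 974404));
	/* baseline -> patched, 64 bytes full forwarding: ~47.64 ns saved */
	printf("improvement: %.2f ns\n", delta_ns(977414, 1025150));
	return 0;
}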
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer