netdev - Re: [RFC PATCH net-next 1/3] ixgbe: support netdev_ops->ndo_xmit

Message-ID: <53FBBE06.3020405@intel.com>
Date:	Mon, 25 Aug 2014 15:51:50 -0700
From:	Alexander Duyck <alexander.h.duyck@...el.com>
To:	Jesper Dangaard Brouer <brouer@...hat.com>,
	Daniel Borkmann <dborkman@...hat.com>
CC:	davem@...emloft.net, netdev@...r.kernel.org
Subject: Re: [RFC PATCH net-next 1/3] ixgbe: support netdev_ops->ndo_xmit_flush()

On 08/25/2014 05:07 AM, Jesper Dangaard Brouer wrote:
> On Sun, 24 Aug 2014 15:42:16 +0200
> Daniel Borkmann <dborkman@...hat.com> wrote:
> 
>> This implements the deferred tail pointer flush API for the ixgbe
>> driver. A similar version was also proposed some time ago by Alexander Duyck.
> 
> I've run some benchmarks with this patch only, which actually show a
> performance regression.
> 
> Using trafgen with QDISC_BYPASS and mmap mode, via cmdline:
>  trafgen --cpp  --dev eth5 --conf udp_example01.trafgen -V --cpus 1
> 
> BASELINE(no-patch): trafgen QDISC_BYPASS and mmap:
>  - tx:1562539 pps
> 
> (This patch only): ixgbe use of .ndo_xmit_flush.
>  - tx:1532299 pps
> 
> Regression: -30240 pps
>  * In nanosec: (1/1562539*10^9)-(1/1532299*10^9) = -12.63 ns
> 
> 
> As DaveM points out, we might not need the mmiowb().
> Result when not performing the mmiowb():
>  - tx:1548352 pps
> 
> Still a small regression: -14187 pps
>  * In nanosec: (1/1562539*10^9)-(1/1548352*10^9) = -5.86 ns
> 
> 
> I was not expecting this "slowdown", with this rather simple use of the
> new ndo_xmit_flush API.  Can anyone explain why this is happening?
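
For reference, the nanosecond figures quoted above come from converting
each packet rate into a per-packet cost (10^9 / pps) and subtracting.
A small Python sketch of that arithmetic, using only the numbers from the
benchmark; the helper names are made up for illustration:

```python
def ns_per_packet(pps: float) -> float:
    """Average time spent per packet, in nanoseconds, at a given packet rate."""
    return 1e9 / pps


def per_packet_delta_ns(baseline_pps: float, patched_pps: float) -> float:
    """Baseline cost minus patched cost; negative means the patched run is slower."""
    return ns_per_packet(baseline_pps) - ns_per_packet(patched_pps)


baseline = 1562539        # no patch
with_flush = 1532299      # patch applied
without_mmiowb = 1548352  # patch applied, mmiowb() dropped

print(f"{per_packet_delta_ns(baseline, with_flush):.2f} ns")      # -12.63 ns
print(f"{per_packet_delta_ns(baseline, without_mmiowb):.2f} ns")  # -5.86 ns
```

At roughly 640 ns per packet for the baseline, the 12.63 ns difference is
about 2% of the per-packet budget.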

One possibility is that we now do less work between the time we write
tail and the time we grab the qdisc lock. Locked transactions are
stalled by MMIO, so we spend more time stuck waiting for the write to
complete and doing nothing.

Then of course there are always the oddball quirks: the code changes
might have changed the alignment of a loop, making Tx cleanup more
expensive than it was before.

Thanks,

Alex