lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJ3xEMiUbjSM4-EZkm-_b+EtAS4NZOA+Whcv3V3=UT+CwmOvQA@mail.gmail.com>
Date:   Thu, 12 Jul 2018 23:10:28 +0300
From:   Or Gerlitz <gerlitz.or@...il.com>
To:     Jesper Dangaard Brouer <brouer@...hat.com>,
        Edward Cree <ecree@...arflare.com>
Cc:     Saeed Mahameed <saeedm@...lanox.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [net-next PATCH] net: ipv4: fix listify ip_rcv_finish in case of forwarding

On Wed, Jul 11, 2018 at 11:06 PM, Jesper Dangaard Brouer
<brouer@...hat.com> wrote:

> Well, I would prefer you to implement those.  I just did a quick
> implementation (its trivially easy) so I have something to benchmark
> with.  The performance boost is quite impressive!

sounds good, but wait


> One reason I didn't "just" send a patch, is that Edward so-fare only
> implemented netif_receive_skb_list() and not napi_gro_receive_list().

sfc does't support gro?! doesn't make sense.. Edward?

> And your driver uses napi_gro_receive().  This sort-of disables GRO for
> your driver, which is not a choice I can make.  Interestingly I get
> around the same netperf TCP_STREAM performance.

Same TCP performance

with GRO and no rx-batching

or

without GRO and yes rx-batching

is by far not intuitive result to me unless both these techniques
mostly serve to eliminate lots of instruction cache misses and the
TCP stack is so much optimized that if the code is in the cache,
going through it once with 64K byte GRO-ed packet is like going
through it ~40 (64K/1500) times with non GRO-ed packets.

What's the baseline (with GRO and no rx-batching) number on your setup?

> I assume we can get even better perf if we "listify" napi_gro_receive.

yeah, that would be very interesting to get there

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ