netdev - Re: Bypass at packet-page level (Was: Optimizing instruction-cache, more packets at each stage)

Message-ID: <20160125231016.4f0d2cd5@redhat.com>
Date:	Mon, 25 Jan 2016 23:10:16 +0100
From:	Jesper Dangaard Brouer <brouer@...hat.com>
To:	John Fastabend <john.fastabend@...il.com>
Cc:	Tom Herbert <tom@...bertland.com>,
	"Michael S. Tsirkin" <mst@...hat.com>,
	David Miller <davem@...emloft.net>,
	Eric Dumazet <eric.dumazet@...il.com>,
	Or Gerlitz <gerlitz.or@...il.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Linux Kernel Network Developers <netdev@...r.kernel.org>,
	Alexander Duyck <alexander.duyck@...il.com>,
	Alexei Starovoitov <alexei.starovoitov@...il.com>,
	Daniel Borkmann <borkmann@...earbox.net>,
	Marek Majkowski <marek@...udflare.com>,
	Hannes Frederic Sowa <hannes@...essinduktion.org>,
	Florian Westphal <fw@...len.de>,
	Paolo Abeni <pabeni@...hat.com>,
	John Fastabend <john.r.fastabend@...el.com>,
	Amir Vadai <amirva@...il.com>,
	Daniel Borkmann <daniel@...earbox.net>,
	Vladislav Yasevich <vyasevich@...il.com>, brouer@...hat.com
Subject: Re: Bypass at packet-page level (Was: Optimizing instruction-cache,
 more packets at each stage)

On Mon, 25 Jan 2016 09:50:16 -0800 John Fastabend <john.fastabend@...il.com> wrote:

> On 16-01-25 09:09 AM, Tom Herbert wrote:
> > On Mon, Jan 25, 2016 at 5:15 AM, Jesper Dangaard Brouer
> > <brouer@...hat.com> wrote:  
> >>
[...]
> >>
> >> There are two ideas getting mixed up here: (1) bundling from the
> >> RX-ring, (2) allowing to pick up the "packet-page" directly.
> >>
> >> Bundling (1) is something that seems natural, and which helps us
> >> amortize the cost between layers (and utilizes icache better). Let's
> >> keep that in another thread.
> >>
> >> This (2) direct forward of "packet-pages" is a fairly extreme idea,
> >> BUT it has the potential of being a new integration point for
> >> "selective" bypass-solutions and bringing RAW/af_packet (RX) up to
> >> speed with bypass-solutions.
>
[...]
> 
> Jesper, at least for your (2) case, what are we missing with the
> bifurcated/queue splitting work? Are you really after systems
> without SR-IOV support, or are you trying to get this on the order
> of queues instead of VFs?

I'm not saying something is missing for bifurcated/queue splitting work.
I'm not trying to work-around SR-IOV.

This is an extreme idea, which I got while looking at the lowest RX layer.

Before working any further on this idea/path, I need/want to evaluate
if it makes sense from a performance point of view.  I need to evaluate
if "pulling" out these "packet-pages" is fast enough to compete with
DPDK/netmap.  Else it makes no sense to work on this path.

As a first step to evaluate this lowest RX layer, I'm simply hacking
the drivers (ixgbe and mlx5) to drop/discard packets within-the-driver.
For now, simply replacing napi_gro_receive() with dev_kfree_skb(), and
measuring the "RX-drop" performance.
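
For illustration only, the shape of this experiment can be modelled in
userspace as below.  Nothing here is kernel or driver code: struct
fake_skb, model_deliver(), model_drop() and model_rx_poll() are
hypothetical stand-ins for the skb, napi_gro_receive(), dev_kfree_skb()
and a NAPI poll loop.  The only point is that the delivery call is
swapped for an immediate free while the rest of the RX loop is unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_buff; not the kernel type. */
struct fake_skb {
	size_t len;
};

struct rx_stats {
	unsigned long delivered;	/* handed up to the modelled stack */
	unsigned long dropped;		/* freed inside the modelled driver */
};

typedef void (*rx_handler_t)(struct rx_stats *, struct fake_skb *);

/* Models the normal path (napi_gro_receive() in a real driver). */
static void model_deliver(struct rx_stats *st, struct fake_skb *skb)
{
	st->delivered++;
	free(skb);	/* the stack would free the buffer eventually */
}

/* Models the hacked path: discard at once (dev_kfree_skb() in a driver). */
static void model_drop(struct rx_stats *st, struct fake_skb *skb)
{
	st->dropped++;
	free(skb);
}

/*
 * Models one poll round: handle at most 'budget' of the 'pending'
 * packets with the given handler.  Returns the number processed.
 */
static int model_rx_poll(struct rx_stats *st, int pending, int budget,
			 rx_handler_t handler)
{
	int done = 0;

	assert(pending >= 0 && budget >= 0);
	while (done < budget && done < pending) {
		struct fake_skb *skb = malloc(sizeof(*skb));

		if (!skb)
			break;
		skb->len = 64;	/* assumed small-packet size */
		handler(st, skb);
		done++;
	}
	return done;
}
```

Timing many calls of model_rx_poll() with model_drop as the handler
corresponds to the "RX-drop" number; note that each packet still pays
for one alloc and one free.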

The next step was to avoid the skb alloc+free calls, but doing so is more
complicated than I first anticipated, as the SKB is tied in fairly
heavily.  Thus, right now I'm instead hooking in my bulk alloc+free
API, as that will remove/mitigate most of the overhead of the
kmem_cache/slab-allocators.
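
As a rough sketch of why bulking helps, the toy userspace allocator
below counts how many times the allocator is entered.  All names
(struct toy_cache, toy_alloc_bulk(), ...) are hypothetical and
malloc() stands in for the slab.  The mainline kernel does provide
kmem_cache_alloc_bulk() and kmem_cache_free_bulk(), but whether those
are exactly the API referred to above is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy cache; 'entries' counts calls into the allocator. */
struct toy_cache {
	size_t obj_size;
	unsigned long entries;
};

/* Per-object path: one allocator entry per object. */
static void *toy_alloc(struct toy_cache *c)
{
	c->entries++;
	return malloc(c->obj_size);
}

static void toy_free(struct toy_cache *c, void *obj)
{
	c->entries++;
	free(obj);
}

/*
 * Bulk path: one allocator entry for n objects.
 * All-or-nothing: returns n on success, 0 on failure.
 */
static size_t toy_alloc_bulk(struct toy_cache *c, size_t n, void **objs)
{
	size_t i;

	assert(objs != NULL || n == 0);
	c->entries++;
	for (i = 0; i < n; i++) {
		objs[i] = malloc(c->obj_size);
		if (!objs[i]) {
			while (i > 0)
				free(objs[--i]);	/* undo partial batch */
			return 0;
		}
	}
	return n;
}

static void toy_free_bulk(struct toy_cache *c, size_t n, void **objs)
{
	size_t i;

	assert(objs != NULL || n == 0);
	c->entries++;
	for (i = 0; i < n; i++)
		free(objs[i]);
}
```

With a batch of 16, the per-object path enters the allocator 32 times
(16 allocs plus 16 frees), while the bulk path enters it twice.  In a
real slab allocator the saving comes from amortizing the fixed
per-call cost over the batch, which this model only counts and does
not measure.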

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer