Date:   Mon, 04 Jun 2018 13:29:01 -0700
From:   Jeff Kirsher <jeffrey.t.kirsher@...el.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Björn Töpel <bjorn.topel@...il.com>
Cc:     mykyta.iziumtsev@...aro.org, mst@...hat.com,
        brian.brooks@...aro.org, magnus.karlsson@...il.com,
        andy@...yhouse.net, francois.ozog@...aro.org,
        willemdebruijn.kernel@...il.com, daniel@...earbox.net, ast@...com,
        intel-wired-lan@...ts.osuosl.org, brouer@...hat.com,
        Björn Töpel <bjorn.topel@...el.com>,
        michael.lundkvist@...csson.com, qi.z.zhang@...el.com,
        michael.chan@...adcom.com, magnus.karlsson@...el.com,
        netdev@...r.kernel.org, ilias.apalodimas@...aro.org
Subject: Re: [Intel-wired-lan] [PATCH bpf-next 00/11] AF_XDP: introducing
 zero-copy support

On Mon, 2018-06-04 at 09:38 -0700, Alexei Starovoitov wrote:
> On Mon, Jun 04, 2018 at 02:05:50PM +0200, Björn Töpel wrote:
> > From: Björn Töpel <bjorn.topel@...el.com>
> > 
> > This patch series introduces zerocopy (ZC) support for
> > AF_XDP. Programs using AF_XDP sockets will now receive RX packets
> > without any copies and can also transmit packets without incurring
> > any copies. No modifications to the application are needed, but the
> > NIC driver needs to be modified to support ZC. If ZC is not
> > supported by the driver, the modes introduced in the AF_XDP patch
> > will be used. Using ZC in our micro benchmarks results in
> > significantly improved performance as can be seen in the
> > performance section later in this cover letter.
> > 
> > Note that for an untrusted application, HW packet steering to a
> > specific queue pair (the one associated with the application) is a
> > requirement when using ZC, as the application would otherwise be
> > able to see other user space processes' packets. If the HW cannot
> > support the required packet steering you need to use the XDP_SKB
> > mode or the XDP_DRV mode without ZC turned on. The XSKMAP
> > introduced in the AF_XDP patch set can be used to do load balancing
> > in that case.
> > 
> > For benchmarking, you can use the xdpsock application from the
> > AF_XDP patch set without any modifications. Say that you would like
> > your UDP traffic from port 4242 to end up in queue 16, on which we
> > will enable AF_XDP. Here, we use ethtool for this:
> > 
> >       ethtool -N p3p2 rx-flow-hash udp4 fn
> >       ethtool -N p3p2 flow-type udp4 src-port 4242 dst-port 4242 \
> >           action 16
> > 
> > Running the rxdrop benchmark in XDP_DRV mode with zerocopy can then
> > be done using:
> > 
> >       samples/bpf/xdpsock -i p3p2 -q 16 -r -N
> > 
> > We have run some benchmarks on a dual socket system with two
> > Broadwell E5 2660 @ 2.0 GHz with hyperthreading turned off. Each
> > socket has 14 cores which gives a total of 28, but only two cores
> > are used in these experiments: one for TX/RX and one for the user
> > space application. The memory is DDR4 @ 2133 MT/s (1067 MHz) and
> > the size of each DIMM is 8192MB and with 8 of those DIMMs in the
> > system we have 64 GB of total memory. The compiler used is gcc
> > (Ubuntu 7.3.0-16ubuntu3) 7.3.0. The NIC is Intel I40E 40Gbit/s
> > using the i40e driver.
> > 
> > Below are the results in Mpps of the I40E NIC benchmark runs for 64
> > and 1500 byte packets, generated by a commercial packet generator
> > HW outputting packets at full 40 Gbit/s line rate. The results are
> > without retpoline so that we can compare against previous numbers.
> > 
> > AF_XDP performance 64 byte packets. Results from the AF_XDP V3
> > patch set are also reported for ease of reference. The numbers
> > within parentheses are from the RFC V1 ZC patch set.
> > Benchmark   XDP_SKB    XDP_DRV    XDP_DRV with zerocopy
> > rxdrop       2.9*       9.6*       21.1(21.5)
> > txpush       2.6*       -          22.0(21.6)
> > l2fwd        1.9*       2.5*       15.3(15.0)
> > 
> > AF_XDP performance 1500 byte packets:
> > Benchmark   XDP_SKB    XDP_DRV    XDP_DRV with zerocopy
> > rxdrop       2.1*       3.3*       3.3(3.3)
> > l2fwd        1.4*       1.8*       3.1(3.1)
> > 
> > * From AF_XDP V3 patch set and cover letter.
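
For reference when reading the tables above, here is a rough sketch of
the theoretical packet rate of a 40 Gbit/s link. It assumes the quoted
packet sizes are full Ethernet frame sizes and that each frame carries
the standard 20 bytes of on-wire overhead (preamble, start-of-frame
delimiter and inter-frame gap). The helper name line_rate_mpps is made
up for this illustration and is not part of the patch set.

```python
# Assumption: each Ethernet frame costs 20 extra bytes on the wire
# (7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap).
WIRE_OVERHEAD_BYTES = 20


def line_rate_mpps(frame_bytes, link_gbps=40.0):
    """Return the theoretical maximum packet rate in Mpps for a given frame size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire / 1e6


if __name__ == "__main__":
    for size in (64, 1500):
        print(f"{size:>5} byte frames: {line_rate_mpps(size):.2f} Mpps")
```

Under these assumptions the link tops out at about 59.5 Mpps for 64
byte frames and about 3.3 Mpps for 1500 byte frames, which suggests
that the 3.3 Mpps rxdrop results are bounded by the link rather than
by the CPU.
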
> > 
> > So why do we not get higher values for RX similar to the 34 Mpps we
> > had in AF_PACKET V4? We ran an experiment with the rxdrop benchmark
> > without using the xdp_do_redirect/flush infrastructure or an XDP
> > program (all traffic on a queue goes to one socket). Instead the
> > driver acts directly on the AF_XDP socket. With this we got 36.9
> > Mpps, a significant improvement without any change to the uapi. So
> > not forcing users to have an XDP program if they do not need it
> > might be a good idea. This measurement is actually higher than what
> > we got with AF_PACKET V4.
> > 
> > XDP performance on our system as a baseline:
> > 
> > 64 byte packets:
> > XDP stats       CPU     pps         issue-pps
> > XDP-RX CPU      16      32.3M       0
> > 
> > 1500 byte packets:
> > XDP stats       CPU     pps         issue-pps
> > XDP-RX CPU      16      3.3M        0
> > 
> > The structure of the patch set is as follows:
> > 
> > Patches 1-3: Plumbing for AF_XDP ZC support
> > Patches 4-5: AF_XDP ZC for RX
> > Patches 6-7: AF_XDP ZC for TX
> 
> Acked-by: Alexei Starovoitov <ast@...nel.org>
> for above patches
> 
> > Patches 8-10: ZC support for i40e.
> 
> these also look good to me.
> would be great if i40e experts take a look at them asap.
> 
> If there are no major objections we'd like to merge all of it
> for this merge window.

We would like a bit more time to review and test the changes. I
understand your eagerness to get this into 4.18, but this change is
large enough that a 24-48 hour review window is not prudent, IMHO.

Alex has also requested more time so that he can review the changes
as well.  I will go ahead and put the entire series in my tree so that
our validation team can start to "kick the tires".