Date:   Mon, 13 Nov 2017 22:07:47 +0900
From:   Björn Töpel <bjorn.topel@...il.com>
To:     Bjorn Topel <bjorn.topel@...il.com>,
        "Karlsson, Magnus" <magnus.karlsson@...el.com>,
        "Duyck, Alexander H" <alexander.h.duyck@...el.com>,
        Alexander Duyck <alexander.duyck@...il.com>,
        John Fastabend <john.fastabend@...il.com>,
        Alexei Starovoitov <ast@...com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        michael.lundkvist@...csson.com, ravineet.singh@...csson.com,
        Daniel Borkmann <daniel@...earbox.net>,
        Netdev <netdev@...r.kernel.org>,
        Willem de Bruijn <willemdebruijn.kernel@...il.com>,
        Tushar Dave <tushar.n.dave@...cle.com>, eric.dumazet@...il.com
Cc:     Björn Töpel <bjorn.topel@...el.com>,
        jesse.brandeburg@...el.com, anjali.singhai@...el.com,
        rami.rosen@...el.com, jeffrey.b.shaw@...el.com,
        ferruh.yigit@...el.com, qi.z.zhang@...el.com, davem@...emloft.net
Subject: Re: [RFC PATCH 00/14] Introducing AF_PACKET V4 support

2017-10-31 13:41 GMT+01:00 Björn Töpel <bjorn.topel@...il.com>:
> From: Björn Töpel <bjorn.topel@...el.com>
>
[...]
>
> We'll do a presentation on AF_PACKET V4 in NetDev 2.2 [1] Seoul,
> Korea, and our paper with complete benchmarks will be released shortly
> on the NetDev 2.2 site.
>

We're back in the saddle after an excellent netdevconf week. Kudos to
the organizers; we had a blast! Thanks for all the constructive
feedback.

Below, I'll summarize the major points that we'll address in the next
RFC.

* Instead of extending AF_PACKET with yet another version, introduce a
  new address/packet family. As for naming, we had some suggestions:
  AF_CAPTURE, AF_CHANNEL, AF_XDP and AF_ZEROCOPY. We'll go for
  AF_ZEROCOPY, unless there are strong opinions against it.

* No explicit zerocopy enablement. Use the zerocopy path if
  supported; if not, fall back to the skb path for netdevs that
  don't support the required ndos. Further, we'll have the zerocopy
  behavior for the skb path as well, meaning that an AF_ZEROCOPY
  socket will consume the skb, and we'll honor skb->queue_mapping,
  so that we only consume the packets for the enabled queue.

* Limit the scope of the first patchset to Rx only, and introduce Tx
  in a separate patchset.

* Minimize the size of the i40e zerocopy patches, by moving the driver
  specific code to separate patches.

* Do not introduce a new XDP action XDP_PASS_TO_KERNEL; instead, use
  an XDP redirect map call with an ingress flag.

* Extend the XDP redirect to support explicit allocator/destructor
  functions. Right now, XDP redirect assumes that the page allocator
  was used, and the XDP redirect cleanup path is decreasing the page
  count of the XDP buffer. This assumption breaks for the zerocopy
  case.
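
To illustrate the last point, here is a rough user-space sketch of a
buffer that carries its own release hook, so that cleanup no longer
assumes the page allocator. Every name here (buf_ops, sketch_buf,
sketch_buf_free, the two counters) is hypothetical and made up for
illustration; none of it is kernel API or code from the patch set, and
the counters merely stand in for a page refcount drop and for handing a
frame back to a user-space-owned pool.

```c
#include <stddef.h>

/* Hypothetical per-buffer ops: where a freed buffer should go. */
struct buf_ops {
	void (*release)(void *ctx, void *buf);
	void *ctx;			/* e.g. the owning zerocopy pool */
};

/* Hypothetical stand-in for an XDP buffer descriptor. */
struct sketch_buf {
	void *data;
	const struct buf_ops *ops;	/* NULL: page-allocator default */
};

int page_refs_dropped;	/* stand-in for a page refcount decrement */
int zc_returned;	/* stand-in for returning a frame to its pool */

/* Default path: the buffer is assumed to come from the page allocator. */
void default_page_release(void *buf)
{
	(void)buf;
	page_refs_dropped++;
}

/* Zerocopy path: the frame belongs to a pool, not the page allocator. */
void zc_release(void *ctx, void *buf)
{
	(void)ctx;
	(void)buf;
	zc_returned++;
}

/* Cleanup dispatches on the buffer's ops instead of assuming pages. */
void sketch_buf_free(struct sketch_buf *b)
{
	if (b->ops && b->ops->release)
		b->ops->release(b->ops->ctx, b->data);
	else
		default_page_release(b->data);
}
```

With this shape, a buffer whose ops pointer is NULL still takes the
page-count path, while a zerocopy buffer is handed back through its own
destructor.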


Björn


> We based this patch set on net-next commit e1ea2f9856b7 ("Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
>
> Please focus your review on:
>
> * The V4 user space interface
> * PACKET_ZEROCOPY and its semantics
> * Packet array interface
> * XDP semantics when executing in zero-copy mode (user space passed
>   buffers)
> * XDP_PASS_TO_KERNEL semantics
>
> To do:
>
> * Investigate the user-space ring structure’s performance problems
> * Continue the XDP integration into packet arrays
> * Optimize performance
> * SKB <-> V4 conversions in tp4a_populate & tp4a_flush
> * Packet buffer is unnecessarily pinned for virtual devices
> * Support shared packet buffers
> * Unify V4 and SKB receive path in I40E driver
> * Support for packets spanning multiple frames
> * Disassociate the packet array implementation from the V4 queue
>   structure
>
> We would really like to thank the reviewers of the limited
> distribution RFC for all their comments that have helped improve the
> interfaces and the code significantly: Alexei Starovoitov, Alexander
> Duyck, Jesper Dangaard Brouer, and John Fastabend. The internal team
> at Intel that has been helping out reviewing code, writing tests, and
> sanity checking our ideas: Rami Rosen, Jeff Shaw, Ferruh Yigit, and Qi
> Zhang, your participation has really helped.
>
> Thanks: Björn and Magnus
>
> [1] https://www.netdevconf.org/2.2/
>
> Björn Töpel (7):
>   packet: introduce AF_PACKET V4 userspace API
>   packet: implement PACKET_MEMREG setsockopt
>   packet: enable AF_PACKET V4 rings
>   packet: wire up zerocopy for AF_PACKET V4
>   i40e: AF_PACKET V4 ndo_tp4_zerocopy Rx support
>   i40e: AF_PACKET V4 ndo_tp4_zerocopy Tx support
>   samples/tpacket4: added tpbench
>
> Magnus Karlsson (7):
>   packet: enable Rx for AF_PACKET V4
>   packet: enable Tx support for AF_PACKET V4
>   netdevice: add AF_PACKET V4 zerocopy ops
>   veth: added support for PACKET_ZEROCOPY
>   samples/tpacket4: added veth support
>   i40e: added XDP support for TP4 enabled queue pairs
>   xdp: introducing XDP_PASS_TO_KERNEL for PACKET_ZEROCOPY use
>
>  drivers/net/ethernet/intel/i40e/i40e.h         |    3 +
>  drivers/net/ethernet/intel/i40e/i40e_ethtool.c |    9 +
>  drivers/net/ethernet/intel/i40e/i40e_main.c    |  837 ++++++++++++-
>  drivers/net/ethernet/intel/i40e/i40e_txrx.c    |  582 ++++++++-
>  drivers/net/ethernet/intel/i40e/i40e_txrx.h    |   38 +
>  drivers/net/veth.c                             |  174 +++
>  include/linux/netdevice.h                      |   16 +
>  include/linux/tpacket4.h                       | 1502 ++++++++++++++++++++++++
>  include/uapi/linux/bpf.h                       |    1 +
>  include/uapi/linux/if_packet.h                 |   65 +-
>  net/packet/af_packet.c                         | 1252 +++++++++++++++++---
>  net/packet/internal.h                          |    9 +
>  samples/tpacket4/Makefile                      |   12 +
>  samples/tpacket4/bench_all.sh                  |   28 +
>  samples/tpacket4/tpbench.c                     | 1390 ++++++++++++++++++++++
>  15 files changed, 5674 insertions(+), 244 deletions(-)
>  create mode 100644 include/linux/tpacket4.h
>  create mode 100644 samples/tpacket4/Makefile
>  create mode 100755 samples/tpacket4/bench_all.sh
>  create mode 100644 samples/tpacket4/tpbench.c
>
> --
> 2.11.0
>
