Message-ID: <CAJ+HfNjf9kZJUnD96gOhPgHKW9rGdtFfn9MFpmjaQ32JpR2MQQ@mail.gmail.com>
Date: Mon, 13 Nov 2017 22:07:47 +0900
From: Björn Töpel <bjorn.topel@...il.com>
To: Bjorn Topel <bjorn.topel@...il.com>,
"Karlsson, Magnus" <magnus.karlsson@...el.com>,
"Duyck, Alexander H" <alexander.h.duyck@...el.com>,
Alexander Duyck <alexander.duyck@...il.com>,
John Fastabend <john.fastabend@...il.com>,
Alexei Starovoitov <ast@...com>,
Jesper Dangaard Brouer <brouer@...hat.com>,
michael.lundkvist@...csson.com, ravineet.singh@...csson.com,
Daniel Borkmann <daniel@...earbox.net>,
Netdev <netdev@...r.kernel.org>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
Tushar Dave <tushar.n.dave@...cle.com>, eric.dumazet@...il.com
Cc: Björn Töpel <bjorn.topel@...el.com>,
jesse.brandeburg@...el.com, anjali.singhai@...el.com,
rami.rosen@...el.com, jeffrey.b.shaw@...el.com,
ferruh.yigit@...el.com, qi.z.zhang@...el.com, davem@...emloft.net
Subject: Re: [RFC PATCH 00/14] Introducing AF_PACKET V4 support
2017-10-31 13:41 GMT+01:00 Björn Töpel <bjorn.topel@...il.com>:
> From: Björn Töpel <bjorn.topel@...el.com>
>
[...]
>
> We'll do a presentation on AF_PACKET V4 in NetDev 2.2 [1] Seoul,
> Korea, and our paper with complete benchmarks will be released shortly
> on the NetDev 2.2 site.
>
We're back in the saddle after an excellent netdevconf week. Kudos to
the organizers; we had a blast! Thanks for all the constructive
feedback.
Below, I'll summarize the major points that we'll address in the next
RFC.
* Instead of extending AF_PACKET with yet another version, introduce a
new address/packet family. As for naming, we had some suggestions:
AF_CAPTURE, AF_CHANNEL, AF_XDP and AF_ZEROCOPY. We'll go for
AF_ZEROCOPY, unless there are strong opinions against it.
* No explicit zerocopy enablement. Use the zerocopy path if
supported; if not, fall back to the skb path for netdevs that
don't support the required ndos. Further, we'll have the zerocopy
behavior for the skb path as well, meaning that an AF_ZEROCOPY
socket will consume the skb and we'll honor skb->queue_mapping,
so that we only consume the packets for the enabled queue.
* Limit the scope of the first patchset to Rx only, and introduce Tx
in a separate patchset.
* Minimize the size of the i40e zerocopy patches by moving the
driver-specific code to separate patches.
* Do not introduce a new XDP action XDP_PASS_TO_KERNEL; instead, use
an XDP redirect map call with an ingress flag.
* Extend the XDP redirect to support explicit allocator/destructor
functions. Right now, XDP redirect assumes that the page allocator
was used, and the XDP redirect cleanup path decreases the page
count of the XDP buffer. This assumption breaks for the zerocopy
case.
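To illustrate the last point, here is a rough userspace sketch in C of
what an explicit destructor hook could look like. It is not kernel code
and not taken from the patch set: every name in it (xdp_buf_sketch,
buf_ops, page_ops, zc_ops, redirect_cleanup) is hypothetical, and the
refcount field only stands in for the page count. The idea is that the
cleanup path calls whatever release function the buffer's owner
registered, instead of assuming the page allocator was used.

```c
/* Userspace sketch only; all names are hypothetical, not kernel API. */
#include <assert.h>
#include <stdlib.h>

struct xdp_buf_sketch;

/* Callbacks supplied by whoever allocated the buffer memory. */
struct buf_ops {
	void (*release)(struct xdp_buf_sketch *buf);
};

struct xdp_buf_sketch {
	void *data;
	int refcount;              /* stands in for the page count */
	const struct buf_ops *ops; /* identifies the memory's owner */
	int returned_to_user;      /* set when a zerocopy buffer is handed back */
};

/* Page-allocator case: drop a reference, free on the last one. */
static void page_release(struct xdp_buf_sketch *buf)
{
	if (--buf->refcount == 0) {
		free(buf->data);
		buf->data = NULL;
	}
}

/*
 * Zerocopy case: the memory belongs to user space, so the refcount
 * must not be touched; the buffer is only marked as returned.
 */
static void zc_release(struct xdp_buf_sketch *buf)
{
	buf->returned_to_user = 1;
}

static const struct buf_ops page_ops = { .release = page_release };
static const struct buf_ops zc_ops = { .release = zc_release };

/* Cleanup path: dispatch to the owner instead of assuming pages. */
static void redirect_cleanup(struct xdp_buf_sketch *buf)
{
	assert(buf && buf->ops && buf->ops->release);
	buf->ops->release(buf);
}
```

With this shape, calling redirect_cleanup() on a buffer that uses
zc_ops leaves its refcount alone, while a buffer that uses page_ops
takes the reference-dropping path that is assumed today.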
Björn
> We based this patch set on net-next commit e1ea2f9856b7 ("Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
>
> Please focus your review on:
>
> * The V4 user space interface
> * PACKET_ZEROCOPY and its semantics
> * Packet array interface
> * XDP semantics when executing in zero-copy mode (user space passed
> buffers)
> * XDP_PASS_TO_KERNEL semantics
>
> To do:
>
> * Investigate the user-space ring structure’s performance problems
> * Continue the XDP integration into packet arrays
> * Optimize performance
> * SKB <-> V4 conversions in tp4a_populate & tp4a_flush
> * Packet buffer is unnecessarily pinned for virtual devices
> * Support shared packet buffers
> * Unify V4 and SKB receive path in I40E driver
> * Support for packets spanning multiple frames
> * Disassociate the packet array implementation from the V4 queue
> structure
>
> We would really like to thank the reviewers of the limited
> distribution RFC for all their comments that have helped improve the
> interfaces and the code significantly: Alexei Starovoitov, Alexander
> Duyck, Jesper Dangaard Brouer, and John Fastabend. The internal team
> at Intel that has been helping out reviewing code, writing tests, and
> sanity checking our ideas: Rami Rosen, Jeff Shaw, Ferruh Yigit, and Qi
> Zhang, your participation has really helped.
>
> Thanks: Björn and Magnus
>
> [1] https://www.netdevconf.org/2.2/
>
> Björn Töpel (7):
> packet: introduce AF_PACKET V4 userspace API
> packet: implement PACKET_MEMREG setsockopt
> packet: enable AF_PACKET V4 rings
> packet: wire up zerocopy for AF_PACKET V4
> i40e: AF_PACKET V4 ndo_tp4_zerocopy Rx support
> i40e: AF_PACKET V4 ndo_tp4_zerocopy Tx support
> samples/tpacket4: added tpbench
>
> Magnus Karlsson (7):
> packet: enable Rx for AF_PACKET V4
> packet: enable Tx support for AF_PACKET V4
> netdevice: add AF_PACKET V4 zerocopy ops
> veth: added support for PACKET_ZEROCOPY
> samples/tpacket4: added veth support
> i40e: added XDP support for TP4 enabled queue pairs
> xdp: introducing XDP_PASS_TO_KERNEL for PACKET_ZEROCOPY use
>
> drivers/net/ethernet/intel/i40e/i40e.h | 3 +
> drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 9 +
> drivers/net/ethernet/intel/i40e/i40e_main.c | 837 ++++++++++++-
> drivers/net/ethernet/intel/i40e/i40e_txrx.c | 582 ++++++++-
> drivers/net/ethernet/intel/i40e/i40e_txrx.h | 38 +
> drivers/net/veth.c | 174 +++
> include/linux/netdevice.h | 16 +
> include/linux/tpacket4.h | 1502 ++++++++++++++++++++++++
> include/uapi/linux/bpf.h | 1 +
> include/uapi/linux/if_packet.h | 65 +-
> net/packet/af_packet.c | 1252 +++++++++++++++++---
> net/packet/internal.h | 9 +
> samples/tpacket4/Makefile | 12 +
> samples/tpacket4/bench_all.sh | 28 +
> samples/tpacket4/tpbench.c | 1390 ++++++++++++++++++++++
> 15 files changed, 5674 insertions(+), 244 deletions(-)
> create mode 100644 include/linux/tpacket4.h
> create mode 100644 samples/tpacket4/Makefile
> create mode 100755 samples/tpacket4/bench_all.sh
> create mode 100644 samples/tpacket4/tpbench.c
>
> --
> 2.11.0
>