Message-ID: <CAJ+HfNh5DWsT6uT9nvzPeUp=XFip5meDammfTXMdd4b6wDqqeQ@mail.gmail.com>
Date:   Tue, 14 Nov 2017 06:33:59 +0100
From:   Björn Töpel <bjorn.topel@...il.com>
To:     Alexei Starovoitov <ast@...com>
Cc:     "Karlsson, Magnus" <magnus.karlsson@...el.com>,
        "Duyck, Alexander H" <alexander.h.duyck@...el.com>,
        Alexander Duyck <alexander.duyck@...il.com>,
        John Fastabend <john.fastabend@...il.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        michael.lundkvist@...csson.com, ravineet.singh@...csson.com,
        Daniel Borkmann <daniel@...earbox.net>,
        Netdev <netdev@...r.kernel.org>,
        Willem de Bruijn <willemdebruijn.kernel@...il.com>,
        Tushar Dave <tushar.n.dave@...cle.com>, eric.dumazet@...il.com,
        Björn Töpel <bjorn.topel@...el.com>,
        jesse.brandeburg@...el.com, anjali.singhai@...el.com,
        rami.rosen@...el.com, jeffrey.b.shaw@...el.com,
        ferruh.yigit@...el.com, qi.z.zhang@...el.com, davem@...emloft.net
Subject: Re: [RFC PATCH 00/14] Introducing AF_PACKET V4 support

2017-11-14 0:50 GMT+01:00 Alexei Starovoitov <ast@...com>:
> On 11/13/17 9:07 PM, Björn Töpel wrote:
>>
>> 2017-10-31 13:41 GMT+01:00 Björn Töpel <bjorn.topel@...il.com>:
>>>
>>> From: Björn Töpel <bjorn.topel@...el.com>
>>>
>> [...]
>>>
>>>
>>> We'll give a presentation on AF_PACKET V4 at NetDev 2.2 [1] in Seoul,
>>> Korea, and our paper with complete benchmarks will be released shortly
>>> on the NetDev 2.2 site.
>>>
>>
>> We're back in the saddle after an excellent netdevconf week. Kudos to
>> the organizers; we had a blast! Thanks for all the constructive
>> feedback.
>>
>> Below, I'll summarize the major points that we'll address in the next
>> RFC.
>>
>> * Instead of extending AF_PACKET with yet another version, introduce a
>>   new address/packet family. As for naming, we had some suggestions:
>>   AF_CAPTURE, AF_CHANNEL, AF_XDP and AF_ZEROCOPY. We'll go for
>>   AF_ZEROCOPY, unless there are strong opinions against it.
>>
>> * No explicit zerocopy enablement. Use the zerocopy path if
>>   supported; if not, fall back to the skb path for netdevs that
>>   don't support the required ndos. Further, we'll have the zerocopy
>>   behavior for the skb path as well, meaning that an AF_ZEROCOPY
>>   socket will consume the skb, and we'll honor skb->queue_mapping,
>>   so that we only consume the packets for the enabled queue.
>>
>> * Limit the scope of the first patchset to Rx only, and introduce Tx
>>   in a separate patchset.
>
>
> all sounds good to me except above bit.
> I don't remember people suggesting to split it this way.
> What's the value of it without tx?
>

We definitely need Tx for our use-cases! Let me rephrase: the idea
was to make the initial patch set without Tx *driver*-specific
code, e.g. using ndo_xdp_xmit/flush at a later point.

So AF_ZEROCOPY, the socket parts, would have Tx support.

@John Did I recall that correctly?

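To make the fallback point from the summary concrete (zerocopy if the
netdev supports it, otherwise the skb path, with skb->queue_mapping
honored), here is a minimal user-space sketch. It is only an
illustration under assumptions: fake_netdev, fake_skb, select_rx_path()
and socket_consumes_skb() are hypothetical stand-ins, not kernel types,
ndos, or anything from the patch set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are kernel types or APIs. */
struct fake_netdev {
    bool has_zerocopy_ndos;      /* driver implements the required ndos */
};

struct fake_skb {
    unsigned int queue_mapping;  /* queue the packet arrived on */
};

enum rx_path { RX_PATH_ZEROCOPY, RX_PATH_SKB };

/* No explicit enablement: use zerocopy when the driver supports it,
 * otherwise fall back to the skb path. */
enum rx_path select_rx_path(const struct fake_netdev *dev)
{
    assert(dev != NULL);
    return dev->has_zerocopy_ndos ? RX_PATH_ZEROCOPY : RX_PATH_SKB;
}

/* skb fallback: the socket consumes only packets whose queue_mapping
 * matches the queue it was enabled for; others are left alone. */
bool socket_consumes_skb(unsigned int bound_queue, const struct fake_skb *skb)
{
    assert(skb != NULL);
    return skb->queue_mapping == bound_queue;
}
```

With this model, a socket enabled for queue 3 would consume an skb with
queue_mapping 3 and leave one with queue_mapping 0 to the regular
stack, regardless of which Rx path the netdev ends up using.
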
>> * Minimize the size of the i40e zerocopy patches, by moving the driver
>>   specific code to separate patches.
>>
>> * Do not introduce a new XDP action XDP_PASS_TO_KERNEL, instead use
>>   XDP redirect map call with ingress flag.
>>
>> * Extend the XDP redirect to support explicit allocator/destructor
>>   functions. Right now, XDP redirect assumes that the page allocator
>>   was used, and the XDP redirect cleanup path is decreasing the page
>>   count of the XDP buffer. This assumption breaks for the zerocopy
>>   case.
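
The allocator/destructor point can be sketched in user space as
follows. This is a rough model under assumptions, not the kernel's XDP
code: buf_owner, buf_ops, redirected_buf and redirect_cleanup() are
hypothetical names. The idea it shows is that the cleanup path calls a
release hook supplied by whoever owns the buffer, instead of always
dropping a page reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; not the kernel's XDP structures. */
struct buf_owner;

struct buf_ops {
    /* Destructor supplied by whoever allocated the buffer. */
    void (*release)(struct buf_owner *owner, void *data);
};

struct buf_owner {
    const struct buf_ops *ops;
    int page_refs;  /* models a page reference count */
    int returned;   /* models frames handed back to a zerocopy pool */
};

struct redirected_buf {
    struct buf_owner *owner;
    void *data;
};

/* Page-allocator case: cleanup drops one page reference. */
void page_release(struct buf_owner *owner, void *data)
{
    (void)data;
    owner->page_refs--;
}

/* Zerocopy case: the frame goes back to its pool; no page refcount
 * is touched. */
void pool_release(struct buf_owner *owner, void *data)
{
    (void)data;
    owner->returned++;
}

const struct buf_ops page_ops = { .release = page_release };
const struct buf_ops pool_ops = { .release = pool_release };

/* Cleanup defers to the owner's destructor instead of assuming the
 * buffer came from the page allocator. */
void redirect_cleanup(struct redirected_buf *buf)
{
    assert(buf != NULL && buf->owner != NULL && buf->owner->ops != NULL);
    buf->owner->ops->release(buf->owner, buf->data);
}
```

A buffer whose owner uses pool_ops is returned to its pool on cleanup,
while one using page_ops keeps today's behavior of decreasing the page
count.
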
>>
>>
>> Björn
>>
>>
>>> We based this patch set on net-next commit e1ea2f9856b7 ("Merge
>>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
>>>
>>> Please focus your review on:
>>>
>>> * The V4 user space interface
>>> * PACKET_ZEROCOPY and its semantics
>>> * Packet array interface
>>> * XDP semantics when executing in zero-copy mode (user space passed
>>>   buffers)
>>> * XDP_PASS_TO_KERNEL semantics
>>>
>>> To do:
>>>
>>> * Investigate the user-space ring structure’s performance problems
>>> * Continue the XDP integration into packet arrays
>>> * Optimize performance
>>> * SKB <-> V4 conversions in tp4a_populate & tp4a_flush
>>> * Packet buffer is unnecessarily pinned for virtual devices
>>> * Support shared packet buffers
>>> * Unify V4 and SKB receive path in I40E driver
>>> * Support for packets spanning multiple frames
>>> * Disassociate the packet array implementation from the V4 queue
>>>   structure
>>>
>>> We would really like to thank the reviewers of the limited
>>> distribution RFC for all their comments that have helped improve the
>>> interfaces and the code significantly: Alexei Starovoitov, Alexander
>>> Duyck, Jesper Dangaard Brouer, and John Fastabend. The internal team
>>> at Intel that has been helping out reviewing code, writing tests, and
>>> sanity checking our ideas: Rami Rosen, Jeff Shaw, Ferruh Yigit, and Qi
>>> Zhang, your participation has really helped.
>>>
>>> Thanks: Björn and Magnus
>>>
>>> [1]
>>> https://www.netdevconf.org/2.2/
>>>
>>>
>>> Björn Töpel (7):
>>>   packet: introduce AF_PACKET V4 userspace API
>>>   packet: implement PACKET_MEMREG setsockopt
>>>   packet: enable AF_PACKET V4 rings
>>>   packet: wire up zerocopy for AF_PACKET V4
>>>   i40e: AF_PACKET V4 ndo_tp4_zerocopy Rx support
>>>   i40e: AF_PACKET V4 ndo_tp4_zerocopy Tx support
>>>   samples/tpacket4: added tpbench
>>>
>>> Magnus Karlsson (7):
>>>   packet: enable Rx for AF_PACKET V4
>>>   packet: enable Tx support for AF_PACKET V4
>>>   netdevice: add AF_PACKET V4 zerocopy ops
>>>   veth: added support for PACKET_ZEROCOPY
>>>   samples/tpacket4: added veth support
>>>   i40e: added XDP support for TP4 enabled queue pairs
>>>   xdp: introducing XDP_PASS_TO_KERNEL for PACKET_ZEROCOPY use
>>>
>>>  drivers/net/ethernet/intel/i40e/i40e.h         |    3 +
>>>  drivers/net/ethernet/intel/i40e/i40e_ethtool.c |    9 +
>>>  drivers/net/ethernet/intel/i40e/i40e_main.c    |  837 ++++++++++++-
>>>  drivers/net/ethernet/intel/i40e/i40e_txrx.c    |  582 ++++++++-
>>>  drivers/net/ethernet/intel/i40e/i40e_txrx.h    |   38 +
>>>  drivers/net/veth.c                             |  174 +++
>>>  include/linux/netdevice.h                      |   16 +
>>>  include/linux/tpacket4.h                       | 1502 ++++++++++++++++++++++++
>>>  include/uapi/linux/bpf.h                       |    1 +
>>>  include/uapi/linux/if_packet.h                 |   65 +-
>>>  net/packet/af_packet.c                         | 1252 +++++++++++++++++---
>>>  net/packet/internal.h                          |    9 +
>>>  samples/tpacket4/Makefile                      |   12 +
>>>  samples/tpacket4/bench_all.sh                  |   28 +
>>>  samples/tpacket4/tpbench.c                     | 1390 ++++++++++++++++++++++
>>>  15 files changed, 5674 insertions(+), 244 deletions(-)
>>>  create mode 100644 include/linux/tpacket4.h
>>>  create mode 100644 samples/tpacket4/Makefile
>>>  create mode 100755 samples/tpacket4/bench_all.sh
>>>  create mode 100644 samples/tpacket4/tpbench.c
>>>
>>> --
>>> 2.11.0
>>>
>
