Message-ID: <CAJ+HfNih0O5ZHpcyV1XacToG+uZJGfO_8nK2=W4XgK2mSaTgWQ@mail.gmail.com>
Date:   Tue, 14 Nov 2017 20:01:01 +0100
From:   Björn Töpel <bjorn.topel@...il.com>
To:     Jesper Dangaard Brouer <brouer@...hat.com>
Cc:     "Karlsson, Magnus" <magnus.karlsson@...el.com>,
        "Duyck, Alexander H" <alexander.h.duyck@...el.com>,
        Alexander Duyck <alexander.duyck@...il.com>,
        John Fastabend <john.fastabend@...il.com>,
        Alexei Starovoitov <ast@...com>,
        michael.lundkvist@...csson.com, ravineet.singh@...csson.com,
        Daniel Borkmann <daniel@...earbox.net>,
        Netdev <netdev@...r.kernel.org>,
        Willem de Bruijn <willemdebruijn.kernel@...il.com>,
        Tushar Dave <tushar.n.dave@...cle.com>, eric.dumazet@...il.com,
        Björn Töpel <bjorn.topel@...el.com>,
        jesse.brandeburg@...el.com, anjali.singhai@...el.com,
        rami.rosen@...el.com, jeffrey.b.shaw@...el.com,
        ferruh.yigit@...el.com, qi.z.zhang@...el.com, davem@...emloft.net
Subject: Re: [RFC PATCH 00/14] Introducing AF_PACKET V4 support (AF_XDP or AF_CHANNEL?)

2017-11-14 18:19 GMT+01:00 Jesper Dangaard Brouer <brouer@...hat.com>:
>
> On Mon, 13 Nov 2017 22:07:47 +0900 Björn Töpel <bjorn.topel@...il.com> wrote:
>
>> I'll summarize the major points that we'll address in the next RFC
>> below.
>>
>> * Instead of extending AF_PACKET with yet another version, introduce a
>>   new address/packet family. As for naming, we had some suggestions:
>>   AF_CAPTURE, AF_CHANNEL, AF_XDP and AF_ZEROCOPY. We'll go for
>>   AF_ZEROCOPY, unless there are strong opinions against it.
>
> I mostly like AF_CHANNEL and AF_XDP. I do know XDP has evolved into
> a kernel-side facility that moves XDP frames/packets _inside_ the
> kernel.
>
> *BUT* I've always imagined that we would create a "channel" to
> userspace, using XDP_REDIRECT to choose which frames get redirected
> into which userspace "channel" (new channel-map type).  Userspace
> pre-allocates and registers memory/pages exactly like in this patchset.
>
> [Step 1]: (non-ZC) XDP_REDIRECT needs to copy frame data into userspace
> memory pages, and update your packet_array etc. (Use map-flush to get
> RX bulking.)
>
> [Step 2]: (ZC) Userspace calls a driver NDO to register pages. The
> XDP_REDIRECT action happens in the driver, and can have knowledge about
> the RX-ring.  It can know if this RX-ring is zero-copy enabled and can
> skip the copy step.
>

Jesper, I *really* like this approach -- especially the fact that the
existing XDP path in the drivers can be reused. I'll spend some time
dissecting the details of your suggestion.
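
To check that I read the two steps correctly, here is a rough,
userspace-only C model of the idea. Every name in it (struct channel,
channel_redirect, channel_rx_buffer, channel_consume) is made up for
illustration; none of it is taken from the patch set or from a kernel
API. It only tries to show that both paths run without any allocation,
and that the zero-copy path differs solely in skipping the memcpy.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_SIZE 2048
#define NUM_FRAMES 4

/* Hypothetical model of a userspace "channel": all memory is
 * registered up front, so the RX path never allocates. */
struct channel {
	uint8_t  frames[NUM_FRAMES][FRAME_SIZE]; /* pre-registered frames */
	uint32_t len[NUM_FRAMES];                /* per-frame RX length */
	uint32_t head;                           /* next frame to fill */
	uint32_t tail;                           /* next frame to consume */
	bool     zero_copy;                      /* RX ring fills frames[] directly */
};

/* Frame the RX ring would fill next; only meaningful in ZC mode. */
static uint8_t *channel_rx_buffer(struct channel *ch)
{
	return ch->frames[ch->head % NUM_FRAMES];
}

/* Models XDP_REDIRECT into the channel.
 * Returns 0 on success, -1 if the frame is too large or the ring is full. */
static int channel_redirect(struct channel *ch, const uint8_t *data,
			    uint32_t len)
{
	uint32_t slot = ch->head % NUM_FRAMES;

	if (len > FRAME_SIZE || ch->head - ch->tail == NUM_FRAMES)
		return -1;

	/* Step 1 (non-ZC): copy into the registered frame.
	 * Step 2 (ZC): data already sits in frames[slot], so no copy. */
	if (!ch->zero_copy)
		memcpy(ch->frames[slot], data, len);

	ch->len[slot] = len;
	ch->head++;
	return 0;
}

/* Userspace side: take the oldest frame, or NULL if the ring is empty. */
static const uint8_t *channel_consume(struct channel *ch, uint32_t *len)
{
	uint32_t slot = ch->tail % NUM_FRAMES;

	if (ch->head == ch->tail)
		return NULL;

	*len = ch->len[slot];
	ch->tail++;
	return ch->frames[slot];
}
```

In the model, a driver without the NDO would always call
channel_redirect() with zero_copy unset and pay one memcpy per frame,
which is the fallback discussed below.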

>> * No explicit zerocopy enablement. Use the zerocopy path if
>>   supported; if not, fall back to the skb path for netdevs that
>>   don't support the required ndos.
>
> When the driver does not support the NDO in the above model, I think
> there will still be a significant performance boost for the non-ZC
> variant.  Even though we need a copy operation, there are no memory
> allocations, as userspace has preallocated and registered pages with
> the kernel (and mem-limits are implicit via mem-size reg by userspace).
>

Yup, and we're not paying for the whole skb creation, given that we
execute from XDP_DRV and not XDP_SKB.

>> * Do not introduce a new XDP action XDP_PASS_TO_KERNEL, instead use
>>   XDP redirect map call with ingress flag.
>
> In the above model, XDP_REDIRECT is used for filtering into a userspace
> "channel".  If ZC gets enabled on an RX-ring queue, then XDP_PASS has
> to do a copy (RX-ring knowledge is available), like you describe with
> XDP_PASS_TO_KERNEL.
>

Again, this fits in nicely.

>> * Extend the XDP redirect to support explicit allocator/destructor
>>   functions. Right now, XDP redirect assumes that the page allocator
>>   was used, and the XDP redirect cleanup path is decreasing the page
>>   count of the XDP buffer. This assumption breaks for the zerocopy
>>   case.
>
> Yes, please.  If XDP_REDIRECT gets a destructor call-back, then we
> can allow XDP_REDIRECT out another net_device, even when ZC is enabled
> on an RX-ring queue.
>
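
A rough illustration of the destructor idea, again as a plain
userspace C model and not as proposed kernel code. The names
(struct buf_ops, struct xdp_buf_model, struct zc_pool,
redirect_cleanup) are hypothetical, and free() merely stands in for
dropping a page reference. The point is that the cleanup path calls
whatever release hook the buffer carries instead of assuming the page
allocator.

```c
#include <stddef.h>
#include <stdlib.h>

#define POOL_SLOTS 8

/* Hypothetical per-buffer memory ops: the owner of the memory says
 * how a buffer is given back once a redirect is done with it. */
struct buf_ops {
	void (*release)(void *buf, void *ctx);
};

/* Hypothetical stand-in for an XDP buffer that remembers its origin. */
struct xdp_buf_model {
	void                 *data;
	const struct buf_ops *ops;
	void                 *ctx;
};

/* Today's assumption: the buffer came from the page allocator.
 * free() stands in for decreasing the page count. */
static void page_release(void *buf, void *ctx)
{
	(void)ctx;
	free(buf);
}

/* Zero-copy case: the frame belongs to userspace-registered memory,
 * so it goes back onto that pool's free list instead. */
struct zc_pool {
	void  *free_list[POOL_SLOTS];
	size_t nfree;
};

static void zc_release(void *buf, void *ctx)
{
	struct zc_pool *pool = ctx;

	if (pool->nfree < POOL_SLOTS)
		pool->free_list[pool->nfree++] = buf;
}

static const struct buf_ops page_ops = { .release = page_release };
static const struct buf_ops zc_ops   = { .release = zc_release };

/* Redirect cleanup no longer cares where the memory came from. */
static void redirect_cleanup(struct xdp_buf_model *b)
{
	b->ops->release(b->data, b->ctx);
	b->data = NULL;
}
```

With a hook like this, a frame redirected out another net_device could
be returned to the zero-copy pool on TX completion.
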
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
