Message-ID: <CAF=yD-KaAXR8sgJrZShv2hNuKb0dRUbjR=r+8ozBGZycU0EQ+A@mail.gmail.com>
Date:   Thu, 8 Feb 2018 18:16:48 -0500
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Björn Töpel <bjorn.topel@...il.com>
Cc:     "Karlsson, Magnus" <magnus.karlsson@...el.com>,
        "Duyck, Alexander H" <alexander.h.duyck@...el.com>,
        Alexander Duyck <alexander.duyck@...il.com>,
        John Fastabend <john.fastabend@...il.com>,
        Alexei Starovoitov <ast@...com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Netdev <netdev@...r.kernel.org>,
        Björn Töpel <bjorn.topel@...el.com>,
        michael.lundkvist@...csson.com,
        "Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
        "Singhai, Anjali" <anjali.singhai@...el.com>,
        "Shaw, Jeffrey B" <jeffrey.b.shaw@...el.com>,
        "Yigit, Ferruh" <ferruh.yigit@...el.com>,
        "Zhang, Qi Z" <qi.z.zhang@...el.com>
Subject: Re: [RFC PATCH 00/24] Introducing AF_XDP support

On Wed, Feb 7, 2018 at 4:28 PM, Björn Töpel <bjorn.topel@...il.com> wrote:
> 2018-02-07 16:54 GMT+01:00 Willem de Bruijn <willemdebruijn.kernel@...il.com>:
>>> We realized, a bit late maybe, that 24 patches is a bit of a mouthful,
>>> so let me try to make it more palatable.
>>
>> Overall, this approach looks great to me.
>>
>
> Yay! :-)
>
>> The patch set incorporates all the feedback from AF_PACKET V4.
>> At this point I don't have additional high-level interface comments.
>>
>
> I have a thought on the socket API. Now, we're registering buffer
> memory *to* the kernel, but mmap:ing the Rx/Tx rings *from* the
> kernel. I'm leaning towards removing the mmap call, in favor of
> registering the rings to kernel analogous to the XDP_MEM_REG socket
> option. We won't guarantee physically contiguous memory for the rings,
> but I think we can live with that. Thoughts?
>
>> As you point out, 24 patches and nearly 6000 changed lines is
>> quite a bit to ingest. Splitting up in smaller patch sets will help
>> give more detailed implementation feedback.
>>
>> The frame pool and device driver changes are largely independent
>> from AF_XDP and probably should be resolved first (esp. the
>> observed regression even without AF_XDP).
>>
>
> Yeah, the regression is unacceptable.
>
> Another way is starting with the patches without zero-copy first
> (i.e. the copy path), and later add the driver modifications. That
> would be the first 7 patches.
>
>> As you suggest, it would be great if the need for a separate
>> xsk_packet_array data structure can be avoided.
>>
>
> Yes, we'll address that!
>
>> Since frames from the same frame pool can be forwarded between
>> multiple device ports and thus AF_XDP sockets, that should perhaps
>> be a separate object independent from the sockets. This comment
>> hints at the awkward situation if tied to a descriptor pair:
>>
>>> +       /* Check if umem is from this socket, if so do not make
>>> +        * circular references.
>>> +        */
>>
>> Since this is in principle just a large shared memory area, could
>> it reuse existing BPF map logic?
>>
>
> Hmm, care to elaborate on your thinking here?

On second thought, that is not workable. I was thinking of reusing
existing mmap support for maps, but that is limited to the perf ring
buffer.

>> More extreme, and perhaps unrealistic, is if the descriptor ring
>> could similarly be a BPF map and the Rx XDP program directly
>> writes the descriptor, instead of triggering xdp_do_xsk_redirect.
>> As we discussed before, this would avoid the need to specify a
>> descriptor format upfront.
>
> Having the XDP program write back the descriptor to the user-space
> ring is really something that would be useful (writing virtio-net
> descriptors...).

Yes, that's a great use case. This ties in with Jason Wang's
presentation on XDP with tap and virtio, too.

https://www.netdevconf.org/2.2/slides/wang-vmperformance-talk.pdf
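
For illustration only, and not taken from the patch set: a minimal
single-threaded sketch of a descriptor ring whose producer writes
descriptors straight into shared slots, with each descriptor indexing
into a separate frame area. The struct layout, the names (struct desc,
ring_produce, ring_consume, desc_to_offset) and the sizes are all
hypothetical; the descriptor format in the RFC may well differ.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE  8u     /* hypothetical; must be a power of two */
#define FRAME_SIZE 2048u  /* hypothetical frame size in bytes */

/* Hypothetical descriptor layout, not the format used by the RFC. */
struct desc {
	uint32_t idx;     /* frame index within the shared buffer area */
	uint32_t len;     /* packet length in bytes */
	uint16_t offset;  /* start of packet data within the frame */
};

struct ring {
	uint32_t head;    /* free-running count of produced descriptors */
	uint32_t tail;    /* free-running count of consumed descriptors */
	struct desc descs[RING_SIZE];
};

/* Producer side: write one descriptor into the next free slot.
 * Returns 0 on success, -1 if the ring is full.
 */
static int ring_produce(struct ring *r, struct desc d)
{
	if (r->head - r->tail == RING_SIZE)
		return -1;
	r->descs[r->head & (RING_SIZE - 1)] = d;
	r->head++;
	return 0;
}

/* Consumer side: read the oldest descriptor.
 * Returns 0 on success, -1 if the ring is empty.
 */
static int ring_consume(struct ring *r, struct desc *out)
{
	if (r->head == r->tail)
		return -1;
	*out = r->descs[r->tail & (RING_SIZE - 1)];
	r->tail++;
	return 0;
}

/* Translate a descriptor into a byte offset in the frame area. */
static size_t desc_to_offset(const struct desc *d)
{
	return (size_t)d->idx * FRAME_SIZE + d->offset;
}
```

If the producer were an XDP program and the consumer a user-space
process, the head and tail updates would additionally need memory
barriers (or acquire/release atomics); the sketch leaves those out to
keep the indexing logic visible.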

> I need to think a bit more about this. :-) Please
> share your ideas!
>
> Thanks for looking into the patches!
>
>
> Björn
