lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 26 Jun 2019 09:42:07 -0700
From:   "Jonathan Lemon" <jonathan.lemon@...il.com>
To:     "Willem de Bruijn" <willemdebruijn.kernel@...il.com>
Cc:     "Toke Høiland-Jørgensen" <toke@...hat.com>,
        "Jesper Dangaard Brouer" <brouer@...hat.com>,
        "Machulsky, Zorik" <zorik@...zon.com>,
        "Jubran, Samih" <sameehj@...zon.com>, davem@...emloft.net,
        netdev@...r.kernel.org, "Woodhouse, David" <dwmw@...zon.co.uk>,
        "Matushevsky, Alexander" <matua@...zon.com>,
        "Bshara, Saeed" <saeedb@...zon.com>,
        "Wilson, Matt" <msw@...zon.com>,
        "Liguori, Anthony" <aliguori@...zon.com>,
        "Bshara, Nafea" <nafea@...zon.com>,
        "Tzalik, Guy" <gtzalik@...zon.com>,
        "Belgazal, Netanel" <netanel@...zon.com>,
        "Saidi, Ali" <alisaidi@...zon.com>,
        "Herrenschmidt, Benjamin" <benh@...zon.com>,
        "Kiyanovski, Arthur" <akiyano@...zon.com>,
        "Daniel Borkmann" <borkmann@...earbox.net>,
        "Ilias Apalodimas" <ilias.apalodimas@...aro.org>,
        "Alexei Starovoitov" <alexei.starovoitov@...il.com>,
        "Jakub Kicinski" <jakub.kicinski@...ronome.com>,
        xdp-newbies@...r.kernel.org
Subject: Re: XDP multi-buffer incl. jumbo-frames (Was: [RFC V1 net-next 1/1]
 net: ena: implement XDP drop support)



On 26 Jun 2019, at 8:20, Willem de Bruijn wrote:

> On Wed, Jun 26, 2019 at 11:01 AM Toke Høiland-Jørgensen 
> <toke@...hat.com> wrote:
>>
>> Jesper Dangaard Brouer <brouer@...hat.com> writes:
>>
>>> On Wed, 26 Jun 2019 13:52:16 +0200
>>> Toke Høiland-Jørgensen <toke@...hat.com> wrote:
>>>
>>>> Jesper Dangaard Brouer <brouer@...hat.com> writes:
>>>>
>>>>> On Tue, 25 Jun 2019 03:19:22 +0000
>>>>> "Machulsky, Zorik" <zorik@...zon.com> wrote:
>>>>>
>>>>>> On 6/23/19, 7:21 AM, "Jesper Dangaard Brouer" 
>>>>>> <brouer@...hat.com> wrote:
>>>>>>
>>>>>>     On Sun, 23 Jun 2019 10:06:49 +0300 <sameehj@...zon.com> 
>>>>>> wrote:
>>>>>>
>>>>>>     > This commit implements the basic functionality of drop/pass 
>>>>>> logic in the
>>>>>>     > ena driver.
>>>>>>
>>>>>>     Usually we require a driver to implement all the XDP return 
>>>>>> codes,
>>>>>>     before we accept it.  But as Daniel and I discussed with 
>>>>>> Zorik during
>>>>>>     NetConf[1], we are going to make an exception and accept the 
>>>>>> driver
>>>>>>     if you also implement XDP_TX.
>>>>>>
>>>>>>     As we trust that Zorik/Amazon will follow and implement 
>>>>>> XDP_REDIRECT
>>>>>>     later, given he/you wants AF_XDP support which requires 
>>>>>> XDP_REDIRECT.
>>>>>>
>>>>>> Jesper, thanks for your comments and very helpful discussion 
>>>>>> during
>>>>>> NetConf! That's the plan, as we agreed. From our side I would 
>>>>>> like to
>>>>>> reiterate again the importance of multi-buffer support by xdp 
>>>>>> frame.
>>>>>> We would really prefer not to see our MTU shrinking because of 
>>>>>> xdp
>>>>>> support.
>>>>>
>>>>> Okay we really need to make a serious attempt to find a way to 
>>>>> support
>>>>> multi-buffer packets with XDP. With the important criteria of not
>>>>> hurting performance of the single-buffer per packet design.
>>>>>
>>>>> I've created a design document[2], that I will update based on our
>>>>> discussions: [2] 
>>>>> https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org
>>>>>
>>>>> The use-case that really convinced me was Eric's packet 
>>>>> header-split.
>
> Thanks for starting this discussion Jesper!
>
>>>>>
>>>>>
>>>>> Lets refresh: Why XDP don't have multi-buffer support:
>>>>>
>>>>> XDP is designed for maximum performance, which is why certain 
>>>>> driver-level
>>>>> use-cases were not supported, like multi-buffer packets (like 
>>>>> jumbo-frames).
>>>>> As it e.g. complicated the driver RX-loop and memory model 
>>>>> handling.
>>>>>
>>>>> The single buffer per packet design, is also tied into eBPF 
>>>>> Direct-Access
>>>>> (DA) to packet data, which can only be allowed if the packet 
>>>>> memory is in
>>>>> contiguous memory.  This DA feature is essential for XDP 
>>>>> performance.
>>>>>
>>>>>
>>>>> One way forward is to define that XDP only get access to the first
>>>>> packet buffer, and it cannot see subsequent buffers. For XDP_TX 
>>>>> and
>>>>> XDP_REDIRECT to work then XDP still need to carry pointers (plus
>>>>> len+offset) to the other buffers, which is 16 bytes per extra 
>>>>> buffer.
>>>>
>>>> Yeah, I think this would be reasonable. As long as we can have a
>>>> metadata field with the full length + still give XDP programs the
>>>> ability to truncate the packet (i.e., discard the subsequent pages)
>>>
>>> You touch upon some interesting complications already:
>>>
>>> 1. It is valuable for XDP bpf_prog to know "full" length?
>>>    (if so, then we need to extend xdp ctx with info)
>>
>> Valuable, quite likely. A hard requirement, probably not (for all use
>> cases).
>
> Agreed.
>
> One common validation use would be to drop any packets whose header
> length disagrees with the actual packet length.
>
>>>  But if we need to know the full length, when the first-buffer is
>>>  processed. Then realize that this affect the drivers RX-loop, 
>>> because
>>>  then we need to "collect" all the buffers before we can know the
>>>  length (although some HW provide this in first descriptor).
>>>
>>>  We likely have to change drivers RX-loop anyhow, as XDP_TX and
>>>  XDP_REDIRECT will also need to "collect" all buffers before the 
>>> packet
>>>  can be forwarded. (Although this could potentially happen later in
>>>  driver loop when it meet/find the End-Of-Packet descriptor bit).
>
> Yes, this might be quite a bit of refactoring of device driver code.
>
> Should we move forward with some initial constraints, e.g., no
> XDP_REDIRECT, no "full" length and no bpf_xdp_adjust_tail?
>
> That already allows many useful programs.
>
> As long as we don't arrive at a design that cannot be extended with
> those features later.

I think collecting all frames until EOP and then processing them
at once sounds reasonable.



>>> 2. Can we even allow helper bpf_xdp_adjust_tail() ?
>>>
>>>  Wouldn't it be easier to disallow a BPF-prog with this helper, when
>>>  driver have configured multi-buffer?
>>
>> Easier, certainly. But then it's even easier to not implement this at
>> all ;)
>>
>>>  Or will it be too restrictive, if jumbo-frame is very uncommon and
>>>  only enabled because switch infra could not be changed (like Amazon
>>>  case).
>
> Header-split, LRO and jumbo frame are certainly not limited to the 
> Amazon case.
>
>> I think it would be preferable to support it; but maybe we can let 
>> that
>> depend on how difficult it actually turns out to be to allow it?
>>
>>>  Perhaps it is better to let bpf_xdp_adjust_tail() fail runtime?
>>
>> If we do disallow it, I think I'd lean towards failing the call at
>> runtime...
>
> Disagree. I'd rather have a program fail at load if it depends on
> multi-frag support while the (driver) implementation does not yet
> support it.

If all packets are collected together (like the bulk queue does), and 
then
passed to XDP, this could easily be made backwards compatible.  If the 
XDP
program isn't 'multi-frag' aware, then each packet is just passed in 
individually.

Of course, passing in the equivalent of a iovec requires some form of 
loop
support on the BPF side, doesn't it?
-- 
Jonathan

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ