Message-ID: <20190628132516.723ef517@cakuba.netronome.com>
Date:   Fri, 28 Jun 2019 13:25:16 -0700
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     "Laatz, Kevin" <kevin.laatz@...el.com>
Cc:     Jonathan Lemon <jonathan.lemon@...il.com>, netdev@...r.kernel.org,
        ast@...nel.org, daniel@...earbox.net, bjorn.topel@...el.com,
        magnus.karlsson@...el.com, bpf@...r.kernel.org,
        intel-wired-lan@...ts.osuosl.org, bruce.richardson@...el.com,
        ciara.loftus@...el.com
Subject: Re: [PATCH 00/11] XDP unaligned chunk placement support

On Fri, 28 Jun 2019 17:19:09 +0100, Laatz, Kevin wrote:
> On 27/06/2019 22:25, Jakub Kicinski wrote:
> > On Thu, 27 Jun 2019 12:14:50 +0100, Laatz, Kevin wrote:  
> >> On the application side (xdpsock), we don't have to worry about the user
> >> defined headroom, since it is 0, so we only need to account for the
> >> XDP_PACKET_HEADROOM when computing the original address (in the default
> >> scenario).  
> > That assumes a specific layout for the data inside the buffer.  Some NICs
> > will prepend information like a timestamp to the packet, meaning the
> > packet would start at offset XDP_PACKET_HEADROOM + metadata len.
> 
> Yes, if NICs prepend extra data to the packet that would be a problem for
> using this feature in isolation. However, if we also add in support for 
> in-order RX and TX rings, that would no longer be an issue.
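
To make the concern above concrete, here is a rough sketch of the address arithmetic under discussion. It is not code from the patchset. The helper names, the 2176-byte stride, and the 8-byte metadata length are hypothetical; only XDP_PACKET_HEADROOM (256 in the kernel uapi headers) is a real value. It assumes the application recovers the original buffer address by subtracting the headroom, as described for xdpsock.

```python
XDP_PACKET_HEADROOM = 256  # value defined in the kernel's uapi bpf.h


def original_addr_aligned(rx_addr, chunk_size):
    """Aligned mode: chunk_size is a power of two, so masking the low
    bits recovers the chunk base wherever the packet starts inside it."""
    assert chunk_size & (chunk_size - 1) == 0, "chunk_size must be a power of two"
    return rx_addr & ~(chunk_size - 1)


def original_addr_unaligned(rx_addr, user_headroom=0):
    """Unaligned mode (hypothetical helper): the base can only be found by
    subtracting the offsets the application knows about."""
    return rx_addr - XDP_PACKET_HEADROOM - user_headroom


# Hypothetical layout: 2176-byte stride, fourth buffer in the umem.
chunk_base = 3 * 2176   # 6528, not a multiple of any power-of-two chunk size
metadata_len = 8        # hypothetical NIC-prepended timestamp

# Packet starts right after the headroom: subtraction recovers the base.
plain = original_addr_unaligned(chunk_base + XDP_PACKET_HEADROOM)

# NIC prepends metadata: the same subtraction is off by metadata_len.
skewed = original_addr_unaligned(chunk_base + XDP_PACKET_HEADROOM + metadata_len)

# Aligned mode tolerates the extra offset, because masking ignores it.
masked = original_addr_aligned(4096 + XDP_PACKET_HEADROOM + metadata_len, 2048)
```

With these numbers, `plain` equals `chunk_base`, `skewed` is 8 bytes past it, and `masked` still lands on the 4096 chunk base.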

Can you shed more light on in-order rings?  Do you mean that RX frames
come in the order in which buffers were placed in the fill queue?  That
wouldn't make practical sense, no?  Even if the application does no
reordering there is also XDP_DROP and XDP_TX.  Please explain :)

> However, even for NICs which do prepend data, this patchset should
> not break anything that is currently working.

My understanding from the beginnings of AF_XDP was that we were
searching for a format flexible enough to support most if not all NICs.
Creating an ABI which will preclude vendors from supporting DPDK via
AF_XDP would seriously undermine the neutrality aspect.

> > I think that's very limiting.  What is the challenge in providing
> > aligned addresses, exactly?  
> The challenges are two-fold:
> 1) it prevents using arbitrary buffer sizes, which will be an issue 
> supporting e.g. jumbo frames in future.

Presumably support for jumbos would require a multi-buffer setup, and
therefore extensions to the ring format. Should we perhaps look into
implementing unaligned chunks by extending ring format as well?

> 2) higher level user-space frameworks which may want to use AF_XDP, such 
> as DPDK, do not currently support having buffers with 'fixed' alignment.
>      The reasons DPDK uses arbitrary placement are:
>          - fixed alignment would stop things working on certain NICs which
>            need the actual writable space specified in units of 1k, so we
>            need 2k + metadata space.
>          - we place padding between buffers to avoid constantly hitting
>            the same memory channels when accessing memory.
>          - arbitrary placement allows the application to choose the actual
>            buffer size it wants to use.
>      We make use of the above to allow us to speed up processing 
> significantly and also reduce the packet buffer memory size.
> 
>      Not having arbitrary buffer alignment also means an AF_XDP driver 
> for DPDK cannot be a drop-in replacement for existing drivers in those 
> frameworks. Even with a new capability to allow an arbitrary buffer 
> alignment, existing apps will need to be modified to use that new 
> capability.
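
As a rough illustration of the sizing argument in the quoted text, the sketch below computes a per-buffer stride from a writable area rounded up to 1k units plus metadata and padding. The 128-byte metadata and 64-byte padding figures are hypothetical, chosen only to show the arithmetic; they are not taken from DPDK or the patchset.

```python
def round_up(n, unit):
    """Round n up to the next multiple of unit."""
    return -(-n // unit) * unit


def buffer_stride(frame_len, metadata_len, pad_len, unit=1024):
    """Per-buffer stride: metadata, then a writable area sized in whole
    `unit`-byte blocks, then padding to spread accesses across channels."""
    return metadata_len + round_up(frame_len, unit) + pad_len


def is_pow2(n):
    return n > 0 and n & (n - 1) == 0


def next_pow2(n):
    """Smallest power of two >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


# Hypothetical figures: standard 1518-byte frame, 128 bytes of metadata,
# 64 bytes of inter-buffer padding.
stride = buffer_stride(frame_len=1518, metadata_len=128, pad_len=64)
aligned_stride = next_pow2(stride)
```

Under these assumptions the natural stride is 2240 bytes, which is not a power of two. Forcing power-of-two chunks would round each buffer up to 4096 bytes, nearly doubling the memory per buffer.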
