[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <4b3f1e4c-7db6-d658-aba1-40237b9aa053@iogearbox.net>
Date:   Sat, 31 Aug 2019 01:29:37 +0200
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Kevin Laatz <kevin.laatz@...el.com>, netdev@...r.kernel.org,
        ast@...nel.org, bjorn.topel@...el.com, magnus.karlsson@...el.com,
        jakub.kicinski@...ronome.com, jonathan.lemon@...il.com,
        saeedm@...lanox.com, maximmi@...lanox.com,
        stephen@...workplumber.org
Cc:     bruce.richardson@...el.com, ciara.loftus@...el.com,
        bpf@...r.kernel.org, intel-wired-lan@...ts.osuosl.org
Subject: Re: [PATCH bpf-next v6 00/12] XDP unaligned chunk placement support
On 8/27/19 4:25 AM, Kevin Laatz wrote:
> This patch set adds the ability to use unaligned chunks in the XDP umem.
> 
> Currently, all chunk addresses passed to the umem are masked to be chunk
> size aligned (max is PAGE_SIZE). This limits where we can place chunks
> within the umem as well as limiting the packet sizes that are supported.
> 
> The changes in this patch set removes these restrictions, allowing XDP to
> be more flexible in where it can place a chunk within a umem. By relaxing
> where the chunks can be placed, it allows us to use an arbitrary buffer
> size and place that wherever we have a free address in the umem. These
> changes add the ability to support arbitrary frame sizes up to 4k
> (PAGE_SIZE) and make it easy to integrate with other existing frameworks
> that have their own memory management systems, such as DPDK.
> In DPDK, for example, there is already support for AF_XDP with zero-copy.
> However, with this patch set the integration will be much more seamless.
> You can find the DPDK AF_XDP driver at:
> https://git.dpdk.org/dpdk/tree/drivers/net/af_xdp
> 
> Since we are now dealing with arbitrary frame sizes, we need also need to
> update how we pass around addresses. Currently, the addresses can simply be
> masked to 2k to get back to the original address. This becomes less trivial
> when using frame sizes that are not a 'power of 2' size. This patch set
> modifies the Rx/Tx descriptor format to use the upper 16-bits of the addr
> field for an offset value, leaving the lower 48-bits for the address (this
> leaves us with 256 Terabytes, which should be enough!). We only need to use
> the upper 16-bits to store the offset when running in unaligned mode.
> Rather than adding the offset (headroom etc) to the address, we will store
> it in the upper 16-bits of the address field. This way, we can easily add
> the offset to the address where we need it, using some bit manipulation and
> addition, and we can also easily get the original address wherever we need
> it (for example in i40e_zca_free) by simply masking to get the lower
> 48-bits of the address field.
> 
> The patch set was tested with the following set up:
>    - Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
>    - Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02)
>    - Driver: i40e
>    - Application: xdpsock with l2fwd (single interface)
>    - Turbo disabled in BIOS
> 
> There are no changes to performance before and after these patches for SKB
> mode and Copy mode. Zero-copy mode saw a performance degradation of ~1.5%.
> 
> This patch set has been applied against
> commit 0bb52b0dfc88 ("tools: bpftool: add 'bpftool map freeze' subcommand")
> 
> Structure of the patch set:
> Patch 1:
>    - Remove unnecessary masking and headroom addition during zero-copy Rx
>      buffer recycling in i40e. This change is required in order for the
>      buffer recycling to work in the unaligned chunk mode.
> 
> Patch 2:
>    - Remove unnecessary masking and headroom addition during
>      zero-copy Rx buffer recycling in ixgbe. This change is required in
>      order for the  buffer recycling to work in the unaligned chunk mode.
> 
> Patch 3:
>    - Add infrastructure for unaligned chunks. Since we are dealing with
>      unaligned chunks that could potentially cross a physical page boundary,
>      we add checks to keep track of that information. We can later use this
>      information to correctly handle buffers that are placed at an address
>      where they cross a page boundary.  This patch also modifies the
>      existing Rx and Tx functions to use the new descriptor format. To
>      handle addresses correctly, we need to mask appropriately based on
>      whether we are in aligned or unaligned mode.
> 
> Patch 4:
>    - This patch updates the i40e driver to make use of the new descriptor
>      format.
> 
> Patch 5:
>    - This patch updates the ixgbe driver to make use of the new descriptor
>      format.
> 
> Patch 6:
>    - This patch updates the mlx5e driver to make use of the new descriptor
>      format. These changes are required to handle the new descriptor format
>      and for unaligned chunks support.
> 
> Patch 7:
>    - This patch allows XSK frames smaller than page size in the mlx5e
>      driver. Relax the requirements to the XSK frame size to allow it to be
>      smaller than a page and even not a power of two. The current
>      implementation can work in this mode, both with Striding RQ and without
>      it.
> 
> Patch 8:
>    - Add flags for umem configuration to libbpf. Since we increase the size
>      of the struct by adding flags, we also need to add the ABI versioning
>      in this patch.
> 
> Patch 9:
>    - Modify xdpsock application to add a command line option for
>      unaligned chunks
> 
> Patch 10:
>    - Since we can now run the application in unaligned chunk mode, we need
>      to make sure we recycle the buffers appropriately.
> 
> Patch 11:
>    - Adds hugepage support to the xdpsock application
> 
> Patch 12:
>    - Documentation update to include the unaligned chunk scenario. We need
>      to explicitly state that the incoming addresses are only masked in the
>      aligned chunk mode and not the unaligned chunk mode.
> 
> ---
> v2:
>    - fixed checkpatch issues
>    - fixed Rx buffer recycling for unaligned chunks in xdpsock
>    - removed unused defines
>    - fixed how chunk_size is calculated in xsk_diag.c
>    - added some performance numbers to cover letter
>    - modified descriptor format to make it easier to retrieve original
>      address
>    - removed patch adding off_t off to the zero copy allocator. This is no
>      longer needed with the new descriptor format.
> 
> v3:
>    - added patch for mlx5 driver changes needed for unaligned chunks
>    - moved offset handling to new helper function
>    - changed value used for the umem chunk_mask. Now using the new
>      descriptor format to save us doing the calculations in a number of
>      places meaning more of the code is left unchanged while adding
>      unaligned chunk support.
> 
> v4:
>    - reworked the next_pg_contig field in the xdp_umem_page struct. We now
>      use the low 12 bits of the addr for flags rather than adding an extra
>      field in the struct.
>    - modified unaligned chunks flag define
>    - fixed page_start calculation in __xsk_rcv_memcpy().
>    - move offset handling to the xdp_umem_get_* functions
>    - modified the len field in xdp_umem_reg struct. We now use 16 bits from
>      this for the flags field.
>    - fixed headroom addition to handle in the mlx5e driver
>    - other minor changes based on review comments
> 
> v5:
>    - Added ABI versioning in the libbpf patch
>    - Removed bitfields in the xdp_umem_reg struct. Adding new flags field.
>    - Added accessors for getting addr and offset.
>    - Added helper function for adding the offset to the addr.
>    - Fixed conflicts with 'bpf-af-xdp-wakeup' which was merged recently.
>    - Fixed typo in mlx driver patch.
>    - Moved libbpf patch to later in the set (7/11, just before the sample
>      app changes)
> 
> v6:
>    - Added support for XSK frames smaller than page in mlx5e driver (Maxim
>      Mikityanskiy <maximmi@...lanox.com).
>    - Fixed offset handling in xsk_generic_rcv.
>    - Added check for base address in xskq_is_valid_addr_unaligned.
> 
> Kevin Laatz (11):
>    i40e: simplify Rx buffer recycle
>    ixgbe: simplify Rx buffer recycle
>    xsk: add support to allow unaligned chunk placement
>    i40e: modify driver for handling offsets
>    ixgbe: modify driver for handling offsets
>    mlx5e: modify driver for handling offsets
>    libbpf: add flags to umem config
>    samples/bpf: add unaligned chunks mode support to xdpsock
>    samples/bpf: add buffer recycling for unaligned chunks to xdpsock
>    samples/bpf: use hugepages in xdpsock app
>    doc/af_xdp: include unaligned chunk case
> 
> Maxim Mikityanskiy (1):
>    net/mlx5e: Allow XSK frames smaller than a page
> 
>   Documentation/networking/af_xdp.rst           | 10 +-
>   drivers/net/ethernet/intel/i40e/i40e_xsk.c    | 26 +++--
>   drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c  | 26 +++--
>   .../ethernet/mellanox/mlx5/core/en/params.c   | 23 ++++-
>   .../ethernet/mellanox/mlx5/core/en/params.h   |  2 +
>   .../net/ethernet/mellanox/mlx5/core/en/xdp.c  |  8 +-
>   .../ethernet/mellanox/mlx5/core/en/xsk/rx.c   |  5 +-
>   .../mellanox/mlx5/core/en/xsk/setup.c         | 15 ++-
>   include/net/xdp_sock.h                        | 75 ++++++++++++++-
>   include/uapi/linux/if_xdp.h                   |  9 ++
>   net/xdp/xdp_umem.c                            | 19 +++-
>   net/xdp/xsk.c                                 | 94 +++++++++++++++----
>   net/xdp/xsk_diag.c                            |  2 +-
>   net/xdp/xsk_queue.h                           | 70 ++++++++++++--
>   samples/bpf/xdpsock_user.c                    | 61 ++++++++----
>   tools/include/uapi/linux/if_xdp.h             |  9 ++
>   tools/lib/bpf/Makefile                        |  5 +-
>   tools/lib/bpf/libbpf.map                      |  1 +
>   tools/lib/bpf/xsk.c                           | 33 ++++++-
>   tools/lib/bpf/xsk.h                           | 27 ++++++
>   20 files changed, 417 insertions(+), 103 deletions(-)
> 
Applied, thanks!
Powered by blists - more mailing lists
 
