Message-ID: <151396262289.20006.1429172971820409456.stgit@firesoul>
Date: Fri, 22 Dec 2017 18:11:34 +0100
From: Jesper Dangaard Brouer <brouer@...hat.com>
To: Daniel Borkmann <borkmann@...earbox.net>,
Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: netdev@...r.kernel.org, Jesper Dangaard Brouer <brouer@...hat.com>,
dsahern@...il.com, gospo@...adcom.com, bjorn.topel@...el.com,
michael.chan@...adcom.com
Subject: [bpf-next V2 PATCH 00/14] xdp: new XDP rx-queue info concept
V2:
* Changed API exposed to drivers
  - Removed invocation of "init" in drivers, and only call "reg"
    (Suggested by Saeed)
  - Allow "reg" to fail and handle this in drivers
    (Suggested by David Ahern)
* Removed the SINKQ qtype; instead allow registering as "unused"
* Also fixed some drivers during testing on actual HW (noted in patches)
There is a need for XDP to know more about the RX-queue a given XDP
frame has arrived on, both for the XDP bpf-prog and for the kernel side.
Instead of extending struct xdp_buff each time new info is needed,
this patchset takes a different approach. Struct xdp_buff is only
extended with a pointer to a struct xdp_rxq_info (allowing for easier
extending this later). This xdp_rxq_info contains information related
to how the driver has set up the individual RX-queues. This is
read-mostly information, and all xdp_buff frames (in the driver's
napi_poll) point to the same xdp_rxq_info (per RX-queue).
We stress this data/cache-line is for read-mostly info. This is NOT
for dynamic per-packet info; use data_meta for such use-cases.
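
To make the pointer relationship concrete, here is a minimal userspace
sketch. It is not the kernel code from patch 01: the `_model` type names,
the field names, and the helper functions are assumptions chosen for
illustration only. It shows every frame handled in one poll pass pointing
at the same per-queue info, so the per-packet struct grows by a single
pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's net_device; only the ifindex is modelled. */
struct net_device_model {
	int ifindex;
};

/* Read-mostly, per RX-queue info (illustrative field names). */
struct xdp_rxq_info_model {
	struct net_device_model *dev;
	uint32_t queue_index;
};

/* Per-packet descriptor: extended with one pointer only. */
struct xdp_buff_model {
	void *data;
	void *data_end;
	struct xdp_rxq_info_model *rxq;
};

/* Models one poll pass: all frames of a queue share the same rxq info. */
static size_t fill_frames(struct xdp_buff_model *frames, size_t n,
			  struct xdp_rxq_info_model *rxq)
{
	size_t i;

	assert(rxq != NULL);
	for (i = 0; i < n; i++) {
		frames[i].data = NULL;
		frames[i].data_end = NULL;
		frames[i].rxq = rxq;
	}
	return n;
}

/* Queue index and ifindex are reached through the shared pointer. */
static uint32_t frame_rx_queue_index(const struct xdp_buff_model *xdp)
{
	return xdp->rxq->queue_index;
}

static int frame_ingress_ifindex(const struct xdp_buff_model *xdp)
{
	return xdp->rxq->dev->ifindex;
}
```

Adding a new read-mostly field later would only touch the per-queue
struct in this model, not the per-packet one.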
This patchset starts out small, and only exposes ingress_ifindex and the
RX-queue index to the XDP/BPF program. Tangible info like
the ingress ifindex and RX queue index is fairly easy to comprehend.
The other future use-cases could allow XDP frames to be recycled back
to the originating device driver, by providing info on RX device and
queue number.
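
As a rough illustration of what a program could do with these two values,
the sketch below counts frames per RX-queue in plain userspace C. The
context struct is a stand-in: a real XDP program would read the fields
from its context argument (see patch 13 for the actual UAPI layout). The
`_model` names, `MODEL_MAX_RXQS`, and `count_frame()` are hypothetical
and not taken from the sample program in patch 14.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the two values exposed to the XDP program. */
struct xdp_ctx_model {
	uint32_t ingress_ifindex;
	uint32_t rx_queue_index;
};

/* Hypothetical upper bound on RX-queues tracked by this sketch. */
enum { MODEL_MAX_RXQS = 64 };

struct rxq_stats_model {
	uint64_t per_queue[MODEL_MAX_RXQS];
	uint64_t out_of_range;   /* queue index beyond the tracked range */
	uint64_t wrong_ifindex;  /* frame arrived on an unexpected device */
};

/* Account one frame against the queue it arrived on. */
static void count_frame(struct rxq_stats_model *stats,
			const struct xdp_ctx_model *ctx,
			uint32_t expected_ifindex)
{
	assert(stats != 0 && ctx != 0);

	if (ctx->ingress_ifindex != expected_ifindex) {
		stats->wrong_ifindex++;
		return;
	}
	/* Bounds check before indexing, as a BPF program would also need. */
	if (ctx->rx_queue_index >= MODEL_MAX_RXQS) {
		stats->out_of_range++;
		return;
	}
	stats->per_queue[ctx->rx_queue_index]++;
}
```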
As XDP doesn't have driver feature flags, and eBPF code due to
bpf-tail-calls cannot determine which XDP driver invokes it, this
patchset has to update every driver that supports XDP.
For driver developers (review individual driver patches!):
The xdp_rxq_info is tied to the driver's RX-ring(s). Whenever an RX-ring
modification requires (temporarily) stopping RX frames, then the
xdp_rxq_info should (likely) also be unregistered and re-registered,
especially if reallocating the pages in the ring. Make sure ethtool
set_channels does the right thing. When replacing the XDP prog, if and
only if the RX-ring needs to be changed, then also re-register the
xdp_rxq_info.
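
The ordering described above can be sketched in userspace as follows.
This is a model under stated assumptions, not driver code: the `_model`
types, the function names, and the choice of -EBUSY are hypothetical.
The point is that the register step can fail and must be handled, and
that a ring resize goes through unregister, reallocate, and re-register.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct rxq_info_model {
	uint32_t queue_index;
	bool registered;
};

struct rx_ring_model {
	struct rxq_info_model rxq;
	void **pages;
	size_t size;
	bool running;
};

/* "reg" is allowed to fail, so it returns an errno-style code. */
static int rxq_info_reg_model(struct rxq_info_model *rxq, uint32_t queue_index)
{
	if (rxq->registered)
		return -EBUSY; /* hypothetical error code for double-reg */
	rxq->queue_index = queue_index;
	rxq->registered = true;
	return 0;
}

static void rxq_info_unreg_model(struct rxq_info_model *rxq)
{
	rxq->registered = false;
}

/* Allocate the ring, then register; undo the allocation if reg fails. */
static int rx_ring_setup(struct rx_ring_model *ring, uint32_t queue_index,
			 size_t size)
{
	int err;

	ring->pages = calloc(size, sizeof(*ring->pages));
	if (!ring->pages)
		return -ENOMEM;
	ring->size = size;

	err = rxq_info_reg_model(&ring->rxq, queue_index);
	if (err) {
		free(ring->pages);
		ring->pages = NULL;
		ring->size = 0;
		return err;
	}
	ring->running = true;
	return 0;
}

/* Stop RX first, then unregister, then release the pages. */
static void rx_ring_teardown(struct rx_ring_model *ring)
{
	ring->running = false;
	rxq_info_unreg_model(&ring->rxq);
	free(ring->pages);
	ring->pages = NULL;
	ring->size = 0;
}

/* Resize (as a ring-size change would trigger): unreg, realloc, re-reg. */
static int rx_ring_resize(struct rx_ring_model *ring, size_t new_size)
{
	uint32_t queue_index = ring->rxq.queue_index;

	assert(ring->rxq.registered);
	rx_ring_teardown(ring);
	return rx_ring_setup(ring, queue_index, new_size);
}
```

In this model a failed resize leaves the ring torn down; a real driver
would need its own policy for recovering the previous configuration.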
I'm Cc'ing the individual driver patches to the registered maintainers.
Testing:
I've only tested the NIC drivers I have hardware for. The general
test procedure is to (DUT = Device Under Test):
(1) run pktgen script pktgen_sample04_many_flows.sh (against DUT)
(2) run samples/bpf program xdp_rxq_info --dev $DEV (on DUT)
(3) runtime modify number of NIC queues via ethtool -L (on DUT)
(4) runtime modify number of NIC ring-size via ethtool -G (on DUT)
Patch based on git tree bpf-next (at commit c475ffad58a8a2):
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/
---
Jesper Dangaard Brouer (14):
xdp: base API for new XDP rx-queue info concept
xdp/mlx5: setup xdp_rxq_info
i40e: setup xdp_rxq_info
ixgbe: setup xdp_rxq_info
xdp/qede: setup xdp_rxq_info and intro xdp_rxq_info_is_reg
mlx4: setup xdp_rxq_info
bnxt_en: setup xdp_rxq_info
nfp: setup xdp_rxq_info
thunderx: setup xdp_rxq_info
tun: setup xdp_rxq_info
virtio_net: setup xdp_rxq_info
xdp: generic XDP handling of xdp_rxq_info
bpf: finally expose xdp_rxq_info to XDP bpf-programs
samples/bpf: program demonstrating access to xdp_rxq_info
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 10
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2
drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 1
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 11
drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 4
drivers/net/ethernet/cavium/thunder/nicvf_queues.h | 2
drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 2
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 18 +
drivers/net/ethernet/intel/i40e/i40e_txrx.h | 3
drivers/net/ethernet/intel/ixgbe/ixgbe.h | 2
drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 4
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 10
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 3
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 13
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 4
drivers/net/ethernet/mellanox/mlx5/core/en.h | 4
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 9
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 1
drivers/net/ethernet/netronome/nfp/nfp_net.h | 5
.../net/ethernet/netronome/nfp/nfp_net_common.c | 10
drivers/net/ethernet/qlogic/qede/qede.h | 2
drivers/net/ethernet/qlogic/qede/qede_fp.c | 1
drivers/net/ethernet/qlogic/qede/qede_main.c | 10
drivers/net/tun.c | 24 +
drivers/net/virtio_net.c | 12
include/linux/filter.h | 2
include/linux/netdevice.h | 2
include/net/xdp.h | 48 ++
include/uapi/linux/bpf.h | 3
net/core/Makefile | 2
net/core/dev.c | 69 ++-
net/core/filter.c | 19 +
net/core/xdp.c | 74 +++
samples/bpf/Makefile | 4
samples/bpf/xdp_rxq_info_kern.c | 96 ++++
samples/bpf/xdp_rxq_info_user.c | 532 ++++++++++++++++++++
36 files changed, 990 insertions(+), 28 deletions(-)
create mode 100644 include/net/xdp.h
create mode 100644 net/core/xdp.c
create mode 100644 samples/bpf/xdp_rxq_info_kern.c
create mode 100644 samples/bpf/xdp_rxq_info_user.c