Message-ID: <152717306303.4777.4205616217877503311.stgit@firesoul>
Date: Thu, 24 May 2018 16:45:41 +0200
From: Jesper Dangaard Brouer <brouer@...hat.com>
To: netdev@...r.kernel.org, Daniel Borkmann <borkmann@...earbox.net>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Jesper Dangaard Brouer <brouer@...hat.com>
Cc: Christoph Hellwig <hch@...radead.org>,
Björn Töpel <bjorn.topel@...el.com>,
John Fastabend <john.fastabend@...il.com>,
Magnus Karlsson <magnus.karlsson@...el.com>,
makita.toshiaki@....ntt.co.jp
Subject: [bpf-next V5 PATCH 0/8] xdp: introduce bulking for ndo_xdp_xmit API

This patchset changes the ndo_xdp_xmit API to take a bulk of xdp frames.

When the kernel is compiled with CONFIG_RETPOLINE, every indirect
function pointer (branch) call hurts performance. For XDP this has a
huge negative performance impact.

This patchset reduces the needed (indirect) calls to ndo_xdp_xmit, and
also prepares for further optimizations. The DMA API's use of indirect
function pointer calls is the primary source of the regression. Using
bulking calls towards the DMA API (via the scatter-gather calls) is
left for a followup patchset.
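
To illustrate the direction of the API change, here is a hedged sketch
of the old per-frame NDO next to the bulked form this series describes.
The typedef names are mine and purely illustrative; see
include/linux/netdevice.h in the patches for the authoritative
prototype and return-value conventions.

  struct net_device;
  struct xdp_frame;

  /* Old style: one indirect call (one retpoline hit) per frame */
  typedef int (*ndo_xdp_xmit_single_t)(struct net_device *dev,
                                       struct xdp_frame *xdpf);

  /* Bulked style: one indirect call per batch of n frames */
  typedef int (*ndo_xdp_xmit_bulk_t)(struct net_device *dev, int n,
                                     struct xdp_frame **frames);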

The other advantage of this API change is that drivers can more easily
amortize the cost of any sync/locking scheme over the bulk of packets.
The assumption of the current API is that the driver implementing the
NDO will also allocate a dedicated XDP TX queue for every CPU in the
system, which is not always possible or practical to configure. E.g.
ixgbe cannot load an XDP program on a machine with more than 96 CPUs,
due to limited hardware TX queues. E.g. virtio_net is hard to configure
as it requires manually increasing the queues. E.g. the tun driver
chooses to take a per XDP frame producer lock, on the queue selected by
smp_processor_id modulo the available queues.
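
As a hedged illustration of the amortization point above (this is not
code from the patches; my_ndo_xdp_xmit, my_tx_queue, my_queue_for_cpu
and my_tx_one are hypothetical names), a driver with fewer TX queues
than CPUs can now take its producer lock once per bulk instead of once
per frame:

  #include <linux/netdevice.h>
  #include <linux/spinlock.h>
  #include <linux/smp.h>

  struct my_tx_queue {
          spinlock_t lock;        /* producer lock shared between CPUs */
          /* ... TX ring state ... */
  };

  /* Hypothetical helpers: pick a queue for this CPU, enqueue one frame
   * (returns true when the ring is full). */
  static struct my_tx_queue *my_queue_for_cpu(struct net_device *dev, int cpu);
  static bool my_tx_one(struct my_tx_queue *txq, struct xdp_frame *frame);

  static int my_ndo_xdp_xmit(struct net_device *dev, int n,
                             struct xdp_frame **frames)
  {
          struct my_tx_queue *txq = my_queue_for_cpu(dev, smp_processor_id());
          int i, sent = 0;

          spin_lock(&txq->lock);          /* taken once for the whole bulk */
          for (i = 0; i < n; i++) {
                  if (my_tx_one(txq, frames[i]))
                          break;
                  sent++;
          }
          spin_unlock(&txq->lock);

          /* Number of frames queued; handling of unsent frames follows
           * whatever convention the series defines for the NDO. */
          return sent;
  }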

I've considered adding 'flags' to ndo_xdp_xmit, but it's not part of
this patchset. This will be a followup patchset, once we know if it
will be needed (e.g. for a non-map xdp_redirect flush-flag, and if
AF_XDP chooses to use ndo_xdp_xmit for TX).

---
V5: Fixed up issues spotted by Daniel and John
V4: Split the series out from 4 into 8 patches. I cannot split the
driver changes from the NDO change, but I've tried to isolate the NDO
change together with the driver change as much as possible.
Jesper Dangaard Brouer (8):
bpf: devmap introduce dev_map_enqueue
bpf: devmap prepare xdp frames for bulking
xdp: add tracepoint for devmap like cpumap have
samples/bpf: xdp_monitor use tracepoint xdp:xdp_devmap_xmit
xdp: introduce xdp_return_frame_rx_napi
xdp: change ndo_xdp_xmit API to support bulking
xdp/trace: extend tracepoint in devmap with an err
samples/bpf: xdp_monitor use err code from tracepoint xdp:xdp_devmap_xmit
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 26 ++++-
drivers/net/ethernet/intel/i40e/i40e_txrx.h | 2
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 21 +++-
drivers/net/tun.c | 37 +++++--
drivers/net/virtio_net.c | 66 +++++++++----
include/linux/bpf.h | 18 +++
include/linux/netdevice.h | 14 ++-
include/net/page_pool.h | 5 +
include/net/xdp.h | 1
include/trace/events/xdp.h | 50 +++++++++-
kernel/bpf/cpumap.c | 2
kernel/bpf/devmap.c | 131 +++++++++++++++++++++++--
net/core/filter.c | 23 +---
net/core/xdp.c | 20 +++-
samples/bpf/xdp_monitor_kern.c | 49 +++++++++
samples/bpf/xdp_monitor_user.c | 69 +++++++++++++
16 files changed, 448 insertions(+), 86 deletions(-)
--
Jesper Dangaard Brouer