Message-ID: <20180829144446.72509a96@redhat.com>
Date: Wed, 29 Aug 2018 14:44:46 +0200
From: Jesper Dangaard Brouer <brouer@...hat.com>
To: Björn Töpel <bjorn.topel@...il.com>
Cc: magnus.karlsson@...el.com, magnus.karlsson@...il.com,
alexander.h.duyck@...el.com, alexander.duyck@...il.com,
ast@...nel.org, daniel@...earbox.net, netdev@...r.kernel.org,
jesse.brandeburg@...el.com, anjali.singhai@...el.com,
peter.waskiewicz.jr@...el.com,
Björn Töpel
<bjorn.topel@...el.com>, michael.lundkvist@...csson.com,
willemdebruijn.kernel@...il.com, john.fastabend@...il.com,
jakub.kicinski@...ronome.com, neerav.parikh@...el.com,
mykyta.iziumtsev@...aro.org, francois.ozog@...aro.org,
ilias.apalodimas@...aro.org, brian.brooks@...aro.org,
u9012063@...il.com, pavel@...tnetmon.com, qi.z.zhang@...el.com,
brouer@...hat.com
Subject: Re: [PATCH bpf-next 11/11] samples/bpf: add -c/--copy
-z/--zero-copy flags to xdpsock
On Tue, 28 Aug 2018 14:44:35 +0200
Björn Töpel <bjorn.topel@...il.com> wrote:
> From: Björn Töpel <bjorn.topel@...el.com>
>
> The -c/--copy -z/--zero-copy flags enforces either copy or zero-copy
> mode.
Nice, thanks for adding this. It allows me to quickly test the
difference between normal-copy vs zero-copy modes.
(Kernel bpf-next without RETPOLINE).
AF_XDP RX-drop:
 Normal-copy mode:    rx 13,070,318 pps - 76.5 ns
 Zero-copy mode:      rx 26,132,328 pps - 38.3 ns
 Compare to XDP_DROP:    34,251,464 pps - 29.2 ns
 XDP_DROP + read:        30,756,664 pps - 32.5 ns
The normal-copy mode is surprisingly fast (and it works for every
driver implementing the regular XDP_REDIRECT action). It is still
faster to do in-kernel XDP_DROP than AF_XDP zero-copy mode dropping,
which was expected given that frames travel to a remote CPU before being
returned (I don't think the remote CPU reads the payload?). The gap in
nanoseconds is actually quite small, so I'm impressed by the SPSC-queue
implementation working across these CPUs.
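As a quick sanity check of the RX-drop figures above, the per-packet
cost is simply one second divided by the packet rate. The sketch below
is not part of the measurements; the helper name ns_per_packet and the
dictionary labels are made up for illustration.

```python
# Convert a packet rate (packets per second) into average time per packet.
NS_PER_SEC = 1_000_000_000


def ns_per_packet(pps: int) -> float:
    """Average per-packet cost in nanoseconds for a given packet rate."""
    return NS_PER_SEC / pps


# RX-drop rates quoted above (packets per second).
rx_drop_pps = {
    "AF_XDP normal-copy": 13_070_318,
    "AF_XDP zero-copy": 26_132_328,
    "XDP_DROP": 34_251_464,
    "XDP_DROP + read": 30_756_664,
}

rx_drop_ns = {name: ns_per_packet(pps) for name, pps in rx_drop_pps.items()}

for name, ns in rx_drop_ns.items():
    print(f"{name:20s} {ns:5.1f} ns")

# Gap between AF_XDP zero-copy dropping and in-kernel XDP_DROP.
zc_gap_ns = rx_drop_ns["AF_XDP zero-copy"] - rx_drop_ns["XDP_DROP"]
print(f"zero-copy vs XDP_DROP gap: {zc_gap_ns:.1f} ns")
```

With these inputs the gap between zero-copy dropping and in-kernel
XDP_DROP comes out at roughly 9 ns per packet.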
AF_XDP layer2-fwd:
 Normal-copy mode:    rx  3,200,885 tx  3,200,892
 Zero-copy mode:      rx 17,026,300 tx 17,026,269
 Compare to XDP_TX:   rx 14,529,079 tx 14,529,850 - 68.82 ns
 XDP_REDIRECT:        rx 13,235,785 tx 13,235,784 - 75.55 ns
The copy-mode is slow because it allocates SKBs internally (I do
wonder if we could speed it up by using ndo_xdp_xmit + disable-BH).
More interesting is that the zero-copy is faster than XDP_TX and
XDP_REDIRECT. I think the speedup comes from avoiding some DMA mapping
calls with ZC.
Side-note: XDP_TX vs. REDIRECT: 75.55 - 68.82 = 6.73 ns. The cost of
going through the xdp_do_redirect_map core is actually quite small :-)
(I have some micro optimizations that should help ~2ns).
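The layer2-fwd table above only gives nanosecond figures for XDP_TX
and XDP_REDIRECT. The same conversion can be applied to the AF_XDP
rows, as in the rough sketch below. It assumes the ns figures are
derived from the TX column (the XDP_TX value of 68.82 ns matches the
TX rate rather than the RX rate); the helper and variable names are
illustrative only.

```python
# Per-packet cost for the layer2-fwd numbers, derived from the TX rates.
NS_PER_SEC = 1_000_000_000


def ns_per_packet(pps: int) -> float:
    """Average per-packet cost in nanoseconds for a given packet rate."""
    return NS_PER_SEC / pps


# Layer2-fwd TX rates quoted above (packets per second).
l2fwd_tx_pps = {
    "AF_XDP normal-copy": 3_200_892,
    "AF_XDP zero-copy": 17_026_269,
    "XDP_TX": 14_529_850,
    "XDP_REDIRECT": 13_235_784,
}

l2fwd_ns = {name: ns_per_packet(pps) for name, pps in l2fwd_tx_pps.items()}

for name, ns in l2fwd_ns.items():
    print(f"{name:20s} {ns:7.2f} ns")

# Extra per-packet cost of XDP_REDIRECT compared to XDP_TX.
redirect_overhead_ns = l2fwd_ns["XDP_REDIRECT"] - l2fwd_ns["XDP_TX"]
print(f"XDP_REDIRECT vs XDP_TX: {redirect_overhead_ns:.2f} ns")
```

Under that assumption the XDP_TX and XDP_REDIRECT rows reproduce the
68.82 ns and 75.55 ns figures, and their difference the 6.73 ns
mentioned in the side-note.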
AF_XDP TX-only:
 Normal-copy mode: tx  2,853,461 pps
 Zero-copy mode:   tx 22,255,311 pps
(There is no XDP mode that does TX to compare against)
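To put the three AF_XDP tests side by side, the zero-copy gain can be
expressed as a ratio of packet rates. This is only a small sketch over
the numbers quoted in this mail; the speedup helper and the test
labels are hypothetical names, and the layer2-fwd row uses the RX
column.

```python
# Zero-copy speedup over normal-copy mode for each AF_XDP test.


def speedup(zero_copy_pps: int, copy_pps: int) -> float:
    """Ratio of zero-copy packet rate to normal-copy packet rate."""
    return zero_copy_pps / copy_pps


# (normal-copy pps, zero-copy pps) per test, taken from the figures above.
af_xdp_results = {
    "RX-drop": (13_070_318, 26_132_328),
    "layer2-fwd (rx)": (3_200_885, 17_026_300),
    "TX-only": (2_853_461, 22_255_311),
}

speedups = {
    test: speedup(zc, copy) for test, (copy, zc) in af_xdp_results.items()
}

for test, ratio in speedups.items():
    print(f"{test:16s} zero-copy is {ratio:.1f}x normal-copy")
```

The ratio is smallest for RX-drop and largest for the TX paths, which
is consistent with the copy-mode TX side paying for SKB allocation.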
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer