Message-ID: <74408d8f-bb05-2d0d-9d4b-ea9b0e17fc5e@iogearbox.net>
Date:   Wed, 23 May 2018 11:24:14 +0200
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Jesper Dangaard Brouer <brouer@...hat.com>, netdev@...r.kernel.org,
        Daniel Borkmann <borkmann@...earbox.net>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Christoph Hellwig <hch@...radead.org>,
        Björn Töpel <bjorn.topel@...el.com>,
        Magnus Karlsson <magnus.karlsson@...el.com>,
        makita.toshiaki@....ntt.co.jp
Subject: Re: [bpf-next V4 PATCH 0/8] xdp: introduce bulking for ndo_xdp_xmit API

On 05/18/2018 03:34 PM, Jesper Dangaard Brouer wrote:
> This patchset changes the ndo_xdp_xmit API to take a bulk of xdp
> frames.
> 
> In this V4 patchset, I've split the series from 4 into 8 patches.
> I cannot split the driver changes from the NDO change, but I've tried
> to isolate the NDO change together with the driver change as much as
> possible.
> 
> When the kernel is compiled with CONFIG_RETPOLINE, every indirect
> function pointer (branch) call hurts performance. For XDP this has a
> huge negative performance impact.
> 
> This patchset reduces the needed (indirect) calls to ndo_xdp_xmit, but
> also prepares for further optimizations.  The DMA API's use of indirect
> function pointer calls is the primary source of the regression.  It is
> left for a followup patchset to use bulking calls towards the DMA API
> (via the scatter-gather calls).
> 
> The other advantage of this API change is that drivers can more easily
> amortize the cost of any sync/locking scheme over the bulk of
> packets.  The assumption of the current API is that the driver
> implementing the NDO will also allocate a dedicated XDP TX queue for
> every CPU in the system, which is not always possible or practical to
> configure. E.g. ixgbe cannot load an XDP program on a machine with
> more than 96 CPUs, due to limited hardware TX queues.  E.g. virtio_net
> is hard to configure as it requires manually increasing the
> queues. E.g. the tun driver chooses to take a producer lock per XDP
> frame, picking the queue as smp_processor_id modulo the available
> queues.
> 
> I've considered adding 'flags' to ndo_xdp_xmit, but it's not part of
> this patchset.  This will be a followup patchset, once we know if this
> will be needed (e.g. for non-map xdp_redirect flush-flag, and if
> AF_XDP chooses to use ndo_xdp_xmit for TX).
> 
> ---
> 
> Jesper Dangaard Brouer (8):
>       bpf: devmap introduce dev_map_enqueue
>       bpf: devmap prepare xdp frames for bulking
>       xdp: add tracepoint for devmap like cpumap have
>       samples/bpf: xdp_monitor use tracepoint xdp:xdp_devmap_xmit
>       xdp: introduce xdp_return_frame_rx_napi
>       xdp: change ndo_xdp_xmit API to support bulking
>       xdp/trace: extend tracepoint in devmap with an err
>       samples/bpf: xdp_monitor use err code from tracepoint xdp:xdp_devmap_xmit

Series applied to bpf-next, thanks Jesper. (Some minor comments in the patches.)
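
To illustrate the two effects the cover letter describes (fewer indirect
calls, and sync/locking cost amortized over a bulk of frames), here is a
minimal userspace sketch. It is not kernel code and not taken from the
patches: fake_dev, fake_dev_ops, txq_stats, send_per_frame and send_bulked
are hypothetical names, the "lock" is only a counter, and BULK_SIZE of 16
is an assumed batch size chosen for the example.

```c
#include <stddef.h>

#define BULK_SIZE 16 /* assumption: illustrative batch size */

struct frame { int id; };

/* Counters standing in for the costs discussed in the cover letter. */
struct txq_stats {
	unsigned long indirect_calls;    /* calls made through a function pointer */
	unsigned long lock_acquisitions; /* times the (simulated) TX lock is taken */
	unsigned long frames_sent;
};

struct fake_dev;

/* Hypothetical ops table, loosely modelled on an NDO-style callback. */
struct fake_dev_ops {
	int (*xmit_one)(struct fake_dev *dev, struct frame *f);
	int (*xmit_bulk)(struct fake_dev *dev, int n, struct frame **frames);
};

struct fake_dev {
	const struct fake_dev_ops *ops;
	struct txq_stats stats;
};

/* Per-frame variant: the simulated lock is taken once per frame. */
static int fake_xmit_one(struct fake_dev *dev, struct frame *f)
{
	(void)f;
	dev->stats.lock_acquisitions++;
	dev->stats.frames_sent++;
	return 0;
}

/* Bulk variant: the simulated lock is taken once for all n frames. */
static int fake_xmit_bulk(struct fake_dev *dev, int n, struct frame **frames)
{
	int i;

	dev->stats.lock_acquisitions++;
	for (i = 0; i < n; i++) {
		(void)frames[i];
		dev->stats.frames_sent++;
	}
	return n;
}

static const struct fake_dev_ops fake_ops = {
	.xmit_one  = fake_xmit_one,
	.xmit_bulk = fake_xmit_bulk,
};

/* One indirect call (and one lock) per frame. */
static struct txq_stats send_per_frame(struct frame *frames, int total)
{
	struct fake_dev dev = { .ops = &fake_ops };
	int i;

	for (i = 0; i < total; i++) {
		dev.stats.indirect_calls++;
		dev.ops->xmit_one(&dev, &frames[i]);
	}
	return dev.stats;
}

/* Queue up to BULK_SIZE frames, then flush with a single indirect call. */
static struct txq_stats send_bulked(struct frame *frames, int total)
{
	struct fake_dev dev = { .ops = &fake_ops };
	struct frame *bulk[BULK_SIZE];
	int count = 0;
	int i;

	for (i = 0; i < total; i++) {
		bulk[count++] = &frames[i];
		if (count == BULK_SIZE) {
			dev.stats.indirect_calls++;
			dev.ops->xmit_bulk(&dev, count, bulk);
			count = 0;
		}
	}
	if (count > 0) { /* flush the partial tail */
		dev.stats.indirect_calls++;
		dev.ops->xmit_bulk(&dev, count, bulk);
	}
	return dev.stats;
}
```

Calling send_per_frame() and send_bulked() on the same array of frames
sends the same number of frames in both cases, but the bulked path makes
one indirect call and one lock acquisition per batch rather than per
frame. With retpolines making each indirect call more expensive, that
difference is the point of the API change.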
