Message-ID: <CAJ8uoz3WEfowgwXXdG3LYbNmJ3Y1CW8nkc=7pvzLvNdfWSCAsA@mail.gmail.com>
Date: Tue, 24 Apr 2018 10:44:04 +0200
From: Magnus Karlsson <magnus.karlsson@...il.com>
To: Jason Wang <jasowang@...hat.com>
Cc: Björn Töpel <bjorn.topel@...il.com>,
"Karlsson, Magnus" <magnus.karlsson@...el.com>,
Alexander Duyck <alexander.h.duyck@...el.com>,
Alexander Duyck <alexander.duyck@...il.com>,
John Fastabend <john.fastabend@...il.com>,
Alexei Starovoitov <ast@...com>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
"Michael S. Tsirkin" <mst@...hat.com>,
Network Development <netdev@...r.kernel.org>,
Björn Töpel <bjorn.topel@...el.com>,
michael.lundkvist@...csson.com,
"Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
"Singhai, Anjali" <anjali.singhai@...el.com>,
"Zhang, Qi Z" <qi.z.zhang@...el.com>
Subject: Re: [PATCH bpf-next 00/15] Introducing AF_XDP support
>> We have run some benchmarks on a dual-socket system with two Broadwell
>> E5 2660 @ 2.0 GHz with hyperthreading turned off. Each socket has 14
>> cores, which gives a total of 28, but only two cores are used in these
>> experiments: one for TX/RX and one for the user space application. The
>> memory is DDR4 @ 2133 MT/s (1067 MHz); each DIMM is 8192 MB, and with 8
>> of those DIMMs in the system we have 64 GB of total memory. The
>> compiler used is gcc version 5.4.0 20160609. The NIC is an Intel I40E
>> 40 Gbit/s using the i40e driver.
>>
>> Below are the results in Mpps of the I40E NIC benchmark runs for 64
>> and 1500 byte packets, generated by commercial packet generator HW
>> sending packets at the full 40 Gbit/s line rate.
>>
>> AF_XDP performance 64 byte packets. Results from RFC V2 in parentheses.
>> Benchmark   XDP_SKB     XDP_DRV
>> rxdrop      2.9(3.0)    9.4(9.3)
>> txpush      2.5(2.2)    NA*
>> l2fwd       1.9(1.7)    2.4(2.4)    (TX using XDP_SKB in both cases)
>
>
> This number does not look very exciting. I can get ~3 Mpps when using
> testpmd in a guest with xdp_redirect.sh on the host between ixgbe and
> TAP/vhost. I believe we can get even better performance without virt. It
> would be interesting to compare this performance with e.g. testpmd +
> virtio_user(vhost_kernel) + XDP.
Note that all the XDP_SKB numbers, plus the TX part of XDP_DRV for l2fwd,
use SKBs and the generic XDP path in the kernel. I am not surprised those
numbers are lower than what you are seeing with XDP_DRV support
(if that is what you are running? I am unsure about your setup). The
9.4 Mpps for RX is what you get with XDP_DRV support and copies out to
user space. Or is it this number you think is low? Zero-copy will be
added in later patch sets.

With that said, both XDP_SKB and XDP_DRV can be optimized. We have not
spent that much time on optimizations at this point.
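
To make the XDP_SKB / XDP_DRV distinction concrete: the mode is chosen
when the XDP program is attached to the interface. Below is a minimal
sketch, assuming libbpf's bpf_set_link_xdp_fd() helper and the
XDP_FLAGS_* attach flags from <linux/if_link.h>; the xdpsock sample does
essentially this, though its option handling may differ.

/* Sketch: attach an already loaded XDP program (prog_fd) to an
 * interface in either generic (SKB) or native (driver) mode.
 * Error handling trimmed for brevity.
 */
#include <linux/if_link.h>	/* XDP_FLAGS_SKB_MODE, XDP_FLAGS_DRV_MODE */
#include <net/if.h>		/* if_nametoindex() */
#include <bpf/libbpf.h>		/* bpf_set_link_xdp_fd() */

static int attach_xdp(const char *ifname, int prog_fd, int use_skb_mode)
{
	unsigned int ifindex = if_nametoindex(ifname);
	__u32 flags = use_skb_mode ? XDP_FLAGS_SKB_MODE : XDP_FLAGS_DRV_MODE;

	if (!ifindex)
		return -1;

	/* Generic (SKB) mode runs the program after skb allocation in
	 * net/core; native (DRV) mode runs it in the driver RX path. */
	return bpf_set_link_xdp_fd(ifindex, prog_fd, flags);
}
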
>
>>
>> AF_XDP performance 1500 byte packets:
>> Benchmark   XDP_SKB     XDP_DRV
>> rxdrop      2.1(2.2)    3.3(3.1)
>> l2fwd       1.4(1.1)    1.8(1.7)    (TX using XDP_SKB in both cases)
>>
>> * NA since we have no support for TX using the XDP_DRV infrastructure
>> in this RFC. This is for a future patch set since it involves
>> changes to the XDP NDOs. Some of this has been upstreamed by Jesper
>> Dangaard Brouer.
>>
>> XDP performance on our system as a baseline:
>>
>> 64 byte packets:
>>  XDP stats       CPU     pps          issue-pps
>>  XDP-RX CPU      16      32,921,521   0
>>
>> 1500 byte packets:
>>  XDP stats       CPU     pps          issue-pps
>>  XDP-RX CPU      16      3,289,491    0
>>
>> Changes from RFC V2:
>>
>> * Optimizations and simplifications to the ring structures inspired by
>> ptr_ring.h
>> * Renamed XDP_[RX|TX]_QUEUE to XDP_[RX|TX]_RING in the uapi to be
>> consistent with AF_PACKET
>> * Support for only having an RX queue or a TX queue defined
>> * Some bug fixes and code cleanup
>>
>> The structure of the patch set is as follows:
>>
>> Patches 1-2: Basic socket and umem plumbing
>> Patches 3-10: RX support together with the new XSKMAP
>> Patches 11-14: TX support
>> Patch 15: Sample application
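
To map that structure onto the resulting uapi, the user-space flow is
roughly: socket(AF_XDP) -> register a UMEM -> create the rings -> bind
to a device and queue. The sketch below is only illustrative; the
struct, sockopt and constant names follow the proposed
include/uapi/linux/if_xdp.h in this series as I read it and could still
change, and ring mmap and descriptor handling are omitted entirely.

/* Illustrative AF_XDP socket setup; no error handling, sizes arbitrary. */
#include <linux/if_xdp.h>
#include <net/if.h>
#include <sys/socket.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef AF_XDP
#define AF_XDP 44	/* not yet in libc headers; value from this series */
#endif
#ifndef SOL_XDP
#define SOL_XDP 283
#endif

#define NUM_FRAMES 1024
#define FRAME_SIZE 2048

int main(void)
{
	struct xdp_umem_reg ureg = {};
	struct sockaddr_xdp sxdp = {};
	int ring_size = 1024;
	void *umem_area;
	int fd = socket(AF_XDP, SOCK_RAW, 0);

	/* Register a chunk of user memory as the packet buffer pool. */
	posix_memalign(&umem_area, getpagesize(), NUM_FRAMES * FRAME_SIZE);
	ureg.addr = (__u64)(unsigned long)umem_area;
	ureg.len = NUM_FRAMES * FRAME_SIZE;
	ureg.frame_size = FRAME_SIZE;
	ureg.frame_headroom = 0;
	setsockopt(fd, SOL_XDP, XDP_UMEM_REG, &ureg, sizeof(ureg));

	/* Size the fill ring and the RX ring; the rings themselves are
	 * then mmap()ed using the offsets from if_xdp.h (omitted here). */
	setsockopt(fd, SOL_XDP, XDP_UMEM_FILL_RING, &ring_size, sizeof(ring_size));
	setsockopt(fd, SOL_XDP, XDP_RX_RING, &ring_size, sizeof(ring_size));

	/* Bind the socket to a specific device and RX queue id. */
	sxdp.sxdp_family = AF_XDP;
	sxdp.sxdp_ifindex = if_nametoindex("eth0");
	sxdp.sxdp_queue_id = 0;
	bind(fd, (struct sockaddr *)&sxdp, sizeof(sxdp));

	/* Packets redirected to this queue by the XDP program now show up
	 * as descriptors into the UMEM in the RX ring. */
	return 0;
}
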
>>
>> We based this patch set on bpf-next commit fbcf93ebcaef ("bpf: btf:
>> Clean up btf.h in uapi")
>>
>> Questions:
>>
>> * How to deal with cache alignment for uapi when different
>> architectures can have different cache line sizes? We have just
>> aligned it to 64 bytes for now, which works for many popular
>> architectures, but not all. Please advise.
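
To make the alignment question concrete, this is the kind of layout we
mean (purely hypothetical struct, not the one in the patches): the
producer and consumer indices of a ring are each padded out to 64 bytes
so they never share a cache line. A uapi header cannot use
____cacheline_aligned_in_smp, since that would make the layout depend on
the kernel configuration, hence the fixed 64.

/* Hypothetical ring header with fixed 64-byte padding. Works for
 * machines with 64-byte (or smaller) cache lines; architectures with
 * e.g. 128-byte lines would still see false sharing, which is exactly
 * the open question above.
 */
struct example_ring_header {
	__u32 producer;
	__u8  pad1[64 - sizeof(__u32)];
	__u32 consumer;
	__u8  pad2[64 - sizeof(__u32)];
	/* descriptors follow */
};
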
>>
>> To do:
>>
>> * Optimize performance
>>
>> * Kernel selftest
>>
>> Post-series plan:
>>
>> * Loadable kernel module support for AF_XDP would be nice. It is
>> unclear how to achieve this, though, since our XDP code depends on
>> net/core.
>>
>> * Support for AF_XDP sockets without an XDP program loaded. In this
>> case all the traffic on a queue should go up to the user space socket.
>
>
> I think we probably need this in the case of TUN XDP for virt guests too.
Yes.
Thanks: Magnus
> Thanks
>
>
>>
>> * Daniel Borkmann's suggestion for a "copy to XDP socket, and return
>> XDP_PASS" for a tcpdump-like functionality.
>>
>> * And of course getting to zero-copy support in small increments.
>>
>> Thanks: Björn and Magnus
>>
>> Björn Töpel (8):
>> net: initial AF_XDP skeleton
>> xsk: add user memory registration support sockopt
>> xsk: add Rx queue setup and mmap support
>> xdp: introduce xdp_return_buff API
>> xsk: add Rx receive functions and poll support
>> bpf: introduce new bpf AF_XDP map type BPF_MAP_TYPE_XSKMAP
>> xsk: wire up XDP_DRV side of AF_XDP
>> xsk: wire up XDP_SKB side of AF_XDP
>>
>> Magnus Karlsson (7):
>> xsk: add umem fill queue support and mmap
>> xsk: add support for bind for Rx
>> xsk: add umem completion queue support and mmap
>> xsk: add Tx queue setup and mmap support
>> xsk: support for Tx
>> xsk: statistics support
>> samples/bpf: sample application for AF_XDP sockets
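
As a side note on the XSKMAP patch in Björn's list above: the
kernel-side half of the RX path is simply an XDP program that redirects
a queue's traffic into a BPF_MAP_TYPE_XSKMAP entry. A minimal sketch,
loosely modeled on samples/bpf/xdpsock_kern.c (map definition style and
section names are illustrative):

/* XDP program redirecting every packet on the receiving queue to the
 * AF_XDP socket stored at that queue's index in the XSKMAP.
 */
#include <linux/bpf.h>
#include "bpf_helpers.h"

struct bpf_map_def SEC("maps") xsks_map = {
	.type        = BPF_MAP_TYPE_XSKMAP,
	.key_size    = sizeof(int),
	.value_size  = sizeof(int),
	.max_entries = 4,
};

SEC("xdp_sock")
int xdp_sock_prog(struct xdp_md *ctx)
{
	/* If no socket is bound at this index, bpf_redirect_map()
	 * results in XDP_ABORTED and the packet is dropped. */
	return bpf_redirect_map(&xsks_map, ctx->rx_queue_index, 0);
}

char _license[] SEC("license") = "GPL";
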
>>
>> MAINTAINERS | 8 +
>> include/linux/bpf.h | 26 +
>> include/linux/bpf_types.h | 3 +
>> include/linux/filter.h | 2 +-
>> include/linux/socket.h | 5 +-
>> include/net/xdp.h | 1 +
>> include/net/xdp_sock.h | 46 ++
>> include/uapi/linux/bpf.h | 1 +
>> include/uapi/linux/if_xdp.h | 87 ++++
>> kernel/bpf/Makefile | 3 +
>> kernel/bpf/verifier.c | 8 +-
>> kernel/bpf/xskmap.c | 286 +++++++++++
>> net/Kconfig | 1 +
>> net/Makefile | 1 +
>> net/core/dev.c | 34 +-
>> net/core/filter.c | 40 +-
>> net/core/sock.c | 12 +-
>> net/core/xdp.c | 15 +-
>> net/xdp/Kconfig | 7 +
>> net/xdp/Makefile | 2 +
>> net/xdp/xdp_umem.c | 256 ++++++++++
>> net/xdp/xdp_umem.h | 65 +++
>> net/xdp/xdp_umem_props.h | 23 +
>> net/xdp/xsk.c | 704 +++++++++++++++++++++++++++
>> net/xdp/xsk_queue.c | 73 +++
>> net/xdp/xsk_queue.h | 245 ++++++++++
>> samples/bpf/Makefile | 4 +
>> samples/bpf/xdpsock.h | 11 +
>> samples/bpf/xdpsock_kern.c | 56 +++
>> samples/bpf/xdpsock_user.c | 947 ++++++++++++++++++++++++++++++++++++
>> security/selinux/hooks.c | 4 +-
>> security/selinux/include/classmap.h | 4 +-
>> 32 files changed, 2945 insertions(+), 35 deletions(-)
>> create mode 100644 include/net/xdp_sock.h
>> create mode 100644 include/uapi/linux/if_xdp.h
>> create mode 100644 kernel/bpf/xskmap.c
>> create mode 100644 net/xdp/Kconfig
>> create mode 100644 net/xdp/Makefile
>> create mode 100644 net/xdp/xdp_umem.c
>> create mode 100644 net/xdp/xdp_umem.h
>> create mode 100644 net/xdp/xdp_umem_props.h
>> create mode 100644 net/xdp/xsk.c
>> create mode 100644 net/xdp/xsk_queue.c
>> create mode 100644 net/xdp/xsk_queue.h
>> create mode 100644 samples/bpf/xdpsock.h
>> create mode 100644 samples/bpf/xdpsock_kern.c
>> create mode 100644 samples/bpf/xdpsock_user.c
>>
>