Message-ID: <CAJ8uoz3WEfowgwXXdG3LYbNmJ3Y1CW8nkc=7pvzLvNdfWSCAsA@mail.gmail.com>
Date:   Tue, 24 Apr 2018 10:44:04 +0200
From:   Magnus Karlsson <magnus.karlsson@...il.com>
To:     Jason Wang <jasowang@...hat.com>
Cc:     Björn Töpel <bjorn.topel@...il.com>,
        "Karlsson, Magnus" <magnus.karlsson@...el.com>,
        Alexander Duyck <alexander.h.duyck@...el.com>,
        Alexander Duyck <alexander.duyck@...il.com>,
        John Fastabend <john.fastabend@...il.com>,
        Alexei Starovoitov <ast@...com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Willem de Bruijn <willemdebruijn.kernel@...il.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Network Development <netdev@...r.kernel.org>,
        Björn Töpel <bjorn.topel@...el.com>,
        michael.lundkvist@...csson.com,
        "Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
        "Singhai, Anjali" <anjali.singhai@...el.com>,
        "Zhang, Qi Z" <qi.z.zhang@...el.com>
Subject: Re: [PATCH bpf-next 00/15] Introducing AF_XDP support

>> We have run some benchmarks on a dual socket system with two Broadwell
>> E5 2660 @ 2.0 GHz with hyperthreading turned off. Each socket has 14
>> cores which gives a total of 28, but only two cores are used in these
>> experiments. One for TX/RX and one for the user space application. The
>> memory is DDR4 @ 2133 MT/s (1067 MHz) and the size of each DIMM is
>> 8192MB and with 8 of those DIMMs in the system we have 64 GB of total
>> memory. The compiler used is gcc version 5.4.0 20160609. The NIC is an
>> Intel I40E 40Gbit/s using the i40e driver.
>>
>> Below are the results in Mpps of the I40E NIC benchmark runs for 64
>> and 1500 byte packets, generated by commercial packet generator HW that is
>> generating packets at full 40 Gbit/s line rate.
>>
>> AF_XDP performance 64 byte packets. Results from RFC V2 in parentheses.
>> Benchmark   XDP_SKB    XDP_DRV
>> rxdrop      2.9(3.0)   9.4(9.3)
>> txpush      2.5(2.2)   NA*
>> l2fwd       1.9(1.7)   2.4(2.4) (TX using XDP_SKB in both cases)
>
>
> This number does not look very exciting. I can get ~3 Mpps when using testpmd
> in a guest with xdp_redirect.sh on the host between ixgbe and TAP/vhost. I
> believe we can get even better performance without virt. It would be
> interesting to compare this performance with e.g. testpmd +
> virtio_user(vhost_kernel) + XDP.

Note that all the XDP_SKB numbers, plus the TX part of XDP_DRV for l2fwd,
use SKBs and the generic XDP path in the kernel. I am not surprised that
those numbers are lower than what you are seeing with XDP_DRV support
(if that is what you are running; I am unsure about your setup). The
9.4 Mpps for RX is what you get with XDP_DRV support plus a copy out
to user space. Or is it this number you think is low? Zero-copy will be
added in later patch sets.

With that said, both XDP_SKB and XDP_DRV can be optimized. We
have not spent that much time on optimizations at this point.
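
To give a feel for how much headroom there is, here is a rough
per-packet cycle budget computed from the 64-byte numbers in this
thread. It is only a sketch: it assumes the core runs at its nominal
2.0 GHz, and it ignores that the AF_XDP runs split the work across two
cores while the XDP baseline runs on one. The function and dictionary
names are made up for illustration.

```python
# Back-of-the-envelope per-packet cycle budget for a single core.
# Assumption: the core runs at the nominal 2.0 GHz (no turbo, no scaling).
CPU_HZ = 2.0e9


def cycles_per_packet(mpps, cpu_hz=CPU_HZ):
    """Average CPU cycles available per packet at a given rate in Mpps."""
    return cpu_hz / (mpps * 1e6)


# 64-byte rates quoted in this thread, in Mpps.
rates_mpps = {
    "rxdrop XDP_SKB": 2.9,
    "rxdrop XDP_DRV": 9.4,
    "XDP-RX baseline": 32.921521,
}

budget = {name: cycles_per_packet(rate) for name, rate in rates_mpps.items()}

for name, cycles in budget.items():
    print(f"{name:16s} {cycles:6.1f} cycles/packet")
```

Under these assumptions, rxdrop with XDP_DRV spends on the order of
200 cycles per packet, against roughly 60 for the plain XDP baseline.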

>
>>
>> AF_XDP performance 1500 byte packets:
>> Benchmark   XDP_SKB    XDP_DRV
>> rxdrop      2.1(2.2)   3.3(3.1)
>> l2fwd       1.4(1.1)   1.8(1.7) (TX using XDP_SKB in both cases)
>>
>> * NA since we have no support for TX using the XDP_DRV infrastructure
>>    in this RFC. This is for a future patch set since it involves
>>    changes to the XDP NDOs. Some of this has been upstreamed by Jesper
>>    Dangaard Brouer.
>>
>> XDP performance on our system as a baseline:
>>
>> 64 byte packets:
>> XDP stats       CPU     pps         issue-pps
>> XDP-RX CPU      16      32,921,521  0
>>
>> 1500 byte packets:
>> XDP stats       CPU     pps         issue-pps
>> XDP-RX CPU      16      3,289,491   0
>>
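
As a sanity check on the baseline figures above, the sketch below
compares them with the theoretical 40 Gbit/s line rate. It assumes that
the quoted packet sizes are full frame sizes including the FCS, so that
each frame costs an extra 20 bytes on the wire (preamble, SFD and
inter-frame gap). That assumption is not stated in the cover letter,
and the helper names are illustrative only.

```python
# Theoretical Ethernet line rate in packets per second.
# Assumption: quoted packet sizes include the FCS, so the only extra
# on-wire cost is preamble + SFD (8 bytes) and inter-frame gap (12 bytes).
WIRE_OVERHEAD_BYTES = 8 + 12

LINK_BPS = 40e9  # 40 Gbit/s port


def line_rate_pps(link_bps, frame_bytes):
    """Maximum frames per second for a given link speed and frame size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


# Measured XDP baseline from the tables above, in pps.
measured = {64: 32_921_521, 1500: 3_289_491}

utilization = {}
for size, pps in measured.items():
    limit = line_rate_pps(LINK_BPS, size)
    utilization[size] = pps / limit
    print(f"{size:5d} B: line rate {limit / 1e6:6.2f} Mpps, "
          f"measured {pps / 1e6:6.2f} Mpps ({utilization[size]:.0%})")
```

If the assumption holds, the 1500-byte baseline is essentially at line
rate, so it is limited by the wire rather than by the CPU, while the
64-byte baseline is at a bit over half of line rate.
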
>> Changes from RFC V2:
>>
>> * Optimizations and simplifications to the ring structures inspired by
>>    ptr_ring.h
>> * Renamed XDP_[RX|TX]_QUEUE to XDP_[RX|TX]_RING in the uapi to be
>>    consistent with AF_PACKET
>> * Support for only having an RX queue or a TX queue defined
>> * Some bug fixes and code cleanup
>>
>> The structure of the patch set is as follows:
>>
>> Patches 1-2: Basic socket and umem plumbing
>> Patches 3-10: RX support together with the new XSKMAP
>> Patches 11-14: TX support
>> Patch 15: Sample application
>>
>> We based this patch set on bpf-next commit fbcf93ebcaef ("bpf: btf:
>> Clean up btf.h in uapi")
>>
>> Questions:
>>
>> * How to deal with cache alignment for uapi when different
>>    architectures can have different cache line sizes? We have just
>>    aligned it to 64 bytes for now, which works for many popular
>>    architectures, but not all. Please advise.
>>
>> To do:
>>
>> * Optimize performance
>>
>> * Kernel selftest
>>
>> Post-series plan:
>>
>> * Loadable kernel module support for AF_XDP would be nice. It is unclear
>>    how to achieve this, though, since our XDP code depends on net/core.
>>
>> * Support for AF_XDP sockets without an XDP program loaded. In this
>>    case all the traffic on a queue should go up to the user space socket.
>
>
> I think we probably need this in the case of TUN XDP for virt guest too.

Yes.

Thanks: Magnus

> Thanks
>
>
>>
>> * Daniel Borkmann's suggestion for a "copy to XDP socket, and return
>>    XDP_PASS" for a tcpdump-like functionality.
>>
>> * And of course getting to zero-copy support in small increments.
>>
>> Thanks: Björn and Magnus
>>
>> Björn Töpel (8):
>>    net: initial AF_XDP skeleton
>>    xsk: add user memory registration support sockopt
>>    xsk: add Rx queue setup and mmap support
>>    xdp: introduce xdp_return_buff API
>>    xsk: add Rx receive functions and poll support
>>    bpf: introduce new bpf AF_XDP map type BPF_MAP_TYPE_XSKMAP
>>    xsk: wire up XDP_DRV side of AF_XDP
>>    xsk: wire up XDP_SKB side of AF_XDP
>>
>> Magnus Karlsson (7):
>>    xsk: add umem fill queue support and mmap
>>    xsk: add support for bind for Rx
>>    xsk: add umem completion queue support and mmap
>>    xsk: add Tx queue setup and mmap support
>>    xsk: support for Tx
>>    xsk: statistics support
>>    samples/bpf: sample application for AF_XDP sockets
>>
>>   MAINTAINERS                         |   8 +
>>   include/linux/bpf.h                 |  26 +
>>   include/linux/bpf_types.h           |   3 +
>>   include/linux/filter.h              |   2 +-
>>   include/linux/socket.h              |   5 +-
>>   include/net/xdp.h                   |   1 +
>>   include/net/xdp_sock.h              |  46 ++
>>   include/uapi/linux/bpf.h            |   1 +
>>   include/uapi/linux/if_xdp.h         |  87 ++++
>>   kernel/bpf/Makefile                 |   3 +
>>   kernel/bpf/verifier.c               |   8 +-
>>   kernel/bpf/xskmap.c                 | 286 +++++++++++
>>   net/Kconfig                         |   1 +
>>   net/Makefile                        |   1 +
>>   net/core/dev.c                      |  34 +-
>>   net/core/filter.c                   |  40 +-
>>   net/core/sock.c                     |  12 +-
>>   net/core/xdp.c                      |  15 +-
>>   net/xdp/Kconfig                     |   7 +
>>   net/xdp/Makefile                    |   2 +
>>   net/xdp/xdp_umem.c                  | 256 ++++++++++
>>   net/xdp/xdp_umem.h                  |  65 +++
>>   net/xdp/xdp_umem_props.h            |  23 +
>>   net/xdp/xsk.c                       | 704 +++++++++++++++++++++++++++
>>   net/xdp/xsk_queue.c                 |  73 +++
>>   net/xdp/xsk_queue.h                 | 245 ++++++++++
>>   samples/bpf/Makefile                |   4 +
>>   samples/bpf/xdpsock.h               |  11 +
>>   samples/bpf/xdpsock_kern.c          |  56 +++
>>   samples/bpf/xdpsock_user.c          | 947 ++++++++++++++++++++++++++++++++++++
>>   security/selinux/hooks.c            |   4 +-
>>   security/selinux/include/classmap.h |   4 +-
>>   32 files changed, 2945 insertions(+), 35 deletions(-)
>>   create mode 100644 include/net/xdp_sock.h
>>   create mode 100644 include/uapi/linux/if_xdp.h
>>   create mode 100644 kernel/bpf/xskmap.c
>>   create mode 100644 net/xdp/Kconfig
>>   create mode 100644 net/xdp/Makefile
>>   create mode 100644 net/xdp/xdp_umem.c
>>   create mode 100644 net/xdp/xdp_umem.h
>>   create mode 100644 net/xdp/xdp_umem_props.h
>>   create mode 100644 net/xdp/xsk.c
>>   create mode 100644 net/xdp/xsk_queue.c
>>   create mode 100644 net/xdp/xsk_queue.h
>>   create mode 100644 samples/bpf/xdpsock.h
>>   create mode 100644 samples/bpf/xdpsock_kern.c
>>   create mode 100644 samples/bpf/xdpsock_user.c
>>
>
