Date:   Tue, 18 Dec 2018 10:36:35 -0800
From:   William Tu <u9012063@...il.com>
To:     Björn Töpel <bjorn.topel@...il.com>
Cc:     Magnus Karlsson <magnus.karlsson@...il.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Netdev <netdev@...r.kernel.org>, makita.toshiaki@....ntt.co.jp,
        "Karlsson, Magnus" <magnus.karlsson@...el.com>
Subject: Re: [bpf-next RFC 0/3] AF_XDP support for veth.

Thanks for the feedback.

On Tue, Dec 18, 2018 at 6:14 AM Björn Töpel <bjorn.topel@...il.com> wrote:
>
> On Mon, 17 Dec 2018 at 20:40, William Tu <u9012063@...il.com> wrote:
> >
> > The patch series adds AF_XDP async xmit support for the veth device.
> > The first patch adds a new API that lets non-physical NIC devices get a
> > packet's virtual address.  The second patch implements the async xmit,
> > and the last patch adds example use cases.
> >
>
> The first virtual device with AF_XDP support! Yay!
>
> This is only the zero-copy on the Tx side -- it's still allocations
> plus copy on the ingress side? That's a bit different from the
> i40e/ixgbe implementation, where zero-copy means both Tx and Rx. For
Right, it's a little different from i40e/ixgbe, which are physical NICs.
For veth, xmit just places the packet into the peer device's rx queue.
Here, the veth AF_XDP implementation does an extra copy out of the
umem, builds the packet, and triggers the receive code at the peer device.
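For reference, here's a rough sketch (not the actual patch) of that copy
path.  xsk_umem_consume_tx_virtual() is the new helper from patch 1 and
its signature here is an assumption; the rest are standard kernel helpers:

  #include <linux/etherdevice.h>
  #include <linux/netdevice.h>
  #include <linux/skbuff.h>
  #include <net/xdp_sock.h>

  static void veth_xsk_xmit_sketch(struct net_device *peer,
                                   struct xdp_umem *umem)
  {
          void *addr;
          u32 len;

          /* Pull descriptors the application placed on the AF_XDP Tx ring.
           * (Assumed signature of the new helper from patch 1.) */
          while (xsk_umem_consume_tx_virtual(umem, &addr, &len)) {
                  struct sk_buff *skb = dev_alloc_skb(len);

                  if (!skb)
                          break;

                  /* The extra copy: umem frame -> freshly built skb. */
                  skb_put_data(skb, addr, len);
                  skb->protocol = eth_type_trans(skb, peer);

                  /* Hand the packet straight to the peer device's Rx path. */
                  netif_rx(skb);
          }

          /* Complete the consumed descriptors back to user space. */
          xsk_umem_consume_tx_done(umem);
  }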

> veth I don't see that we need to support Rx right away, especially for
> Tx only sockets. Still, when the netdev has accepted the umem via
> ndo_bpf, the zero-copy for both Tx and Rx is assumed. We might want to
> change the ndo_bpf at some point to support zero-copy for Tx, Rx, Tx
> *and* Rx.
>
> Are you planning to add zero-copy to the ingress side, i.e. pulling
> frames from the fill ring, instead of allocating via dev_alloc_page?
> (The term *zero-copy* for veth is a bit weird, since we're still doing
> copies, but eliding the page allocation. :-))

Yes, I'm trying to remove this dev_alloc_page, but haven't been successful yet.
Do you think we should go directly to the zero-copy version for the next patch?
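Roughly, the idea would be something like the sketch below (just a sketch,
not working code), using the existing xsk_umem_peek_addr()/
xsk_umem_get_data() helpers to take a frame from the fill ring instead of
calling dev_alloc_page():

  #include <net/xdp_sock.h>

  /* Return a umem frame pulled from the fill ring, or NULL so the caller
   * can fall back to the regular page allocation path. */
  static void *veth_xsk_rx_frame_sketch(struct xdp_umem *umem)
  {
          u64 addr;

          /* Peek the next free frame address posted on the fill ring. */
          if (!xsk_umem_peek_addr(umem, &addr))
                  return NULL;

          /* Commit the peek so the fill-ring entry is consumed. */
          xsk_umem_discard_addr(umem);

          /* Translate the umem offset into a kernel virtual address. */
          return xsk_umem_get_data(umem, addr);
  }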

>
> It would be interesting to hear a bit about what use-case veth/AF_XDP
> has, if you can share that.
>
Yes, we've been working on OVS + AF_XDP netdev support.
See the OVS Conference talk "Fast Userspace OVS with AF_XDP":
http://www.openvswitch.org/support/ovscon2018/

From OVS's perspective, AF_XDP is just a netdev doing packet I/O,
that is, a faster way to send and receive packets.
With i40e/ixgbe AF_XDP support, OVS can forward packets at a very
high packet rate.  However, users also attach virtual ports to the OVS
bridge, for example a tap device connected to a VM, or a veth peer device
connected to a container.  So packets flow from:
Physical NIC (with AF_XDP) --> OVS --> virtual port (no AF_XDP) --> VM/container

Since there is no AF_XDP support for virtual devices yet, the performance
drops significantly.  That's the motivation for this patch series: to add
virtual device support for AF_XDP.
Ultimately, I hope that with AF_XDP support, a packet coming from a
physical NIC can be DMA'd directly into the umem, processed by OVS or
other packet processing software, zero-copied to the tap/veth peer device,
and then received by the VM/container application.

Thanks.
William


>
> Cheers,
> Björn
>
> > I tested with 2 namespaces, one as sender, the other as receiver.
> > The packet rate is measured at the receiver side.
> >   ip netns add at_ns0
> >   ip link add p0 type veth peer name p1
> >   ip link set p0 netns at_ns0
> >   ip link set dev p1 up
> >   ip netns exec at_ns0 ip link set dev p0 up
> >
> >   # receiver
> >   ip netns exec at_ns0 xdp_rxq_info --dev p0 --action XDP_DROP
> >
> >   # sender with AF_XDP
> >   xdpsock -i p1 -t -N -z
> >
> >   # or sender without AF_XDP
> >   xdpsock -i p1 -t -S
> >
> > Without AF_XDP: 724 Kpps
> > RXQ stats       RXQ:CPU pps         issue-pps
> > rx_queue_index    0:1   724339      0
> > rx_queue_index    0:sum 724339
> >
> > With AF_XDP: 1.1 Mpps (with ksoftirqd 100% cpu)
> > RXQ stats       RXQ:CPU pps         issue-pps
> > rx_queue_index    0:3   1188181     0
> > rx_queue_index    0:sum 1188181
> >
> > William Tu (3):
> >   xsk: add xsk_umem_consume_tx_virtual.
> >   veth: support AF_XDP.
> >   samples: bpf: add veth AF_XDP example.
> >
> >  drivers/net/veth.c             | 247 ++++++++++++++++++++++++++++++++++++++++-
> >  include/net/xdp_sock.h         |   7 ++
> >  net/xdp/xsk.c                  |  24 ++++
> >  samples/bpf/test_veth_afxdp.sh |  67 +++++++++++
> >  4 files changed, 343 insertions(+), 2 deletions(-)
> >  create mode 100755 samples/bpf/test_veth_afxdp.sh
> >
> > --
> > 2.7.4
> >
