netdev - Re: [External] Re: [RFC Optimizing veth xsk performance 00/10]

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CABKxMyMj4+9r44GFO3PM15516Pkmr=g8WBVhdjv7tk231fGqWg@mail.gmail.com>
Date: Fri, 4 Aug 2023 12:16:05 +0800
From: 黄杰 <huangjie.albert@...edance.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, 
	Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, 
	Jesper Dangaard Brouer <hawk@...nel.org>, John Fastabend <john.fastabend@...il.com>, 
	Björn Töpel <bjorn@...nel.org>, 
	Magnus Karlsson <magnus.karlsson@...el.com>, 
	Maciej Fijalkowski <maciej.fijalkowski@...el.com>, Jonathan Lemon <jonathan.lemon@...il.com>, 
	Pavel Begunkov <asml.silence@...il.com>, Yunsheng Lin <linyunsheng@...wei.com>, 
	Kees Cook <keescook@...omium.org>, Richard Gobert <richardbgobert@...il.com>, 
	"open list:NETWORKING DRIVERS" <netdev@...r.kernel.org>, open list <linux-kernel@...r.kernel.org>, 
	"open list:XDP (eXpress Data Path)" <bpf@...r.kernel.org>
Subject: Re: [External] Re: [RFC Optimizing veth xsk performance 00/10]

Paolo Abeni <pabeni@...hat.com> 于2023年8月3日周四 22:20写道：
>
> On Thu, 2023-08-03 at 22:04 +0800, huangjie.albert wrote:
> > AF_XDP is a kernel bypass technology that can greatly improve performance.
> > However, for virtual devices like veth, even with the use of AF_XDP sockets,
> > there are still many additional software paths that consume CPU resources.
> > This patch series focuses on optimizing the performance of AF_XDP sockets
> > for veth virtual devices. Patches 1 to 4 mainly involve preparatory work.
> > Patch 5 introduces tx queue and tx napi for packet transmission, while
> > patch 9 primarily implements zero-copy, and patch 10 adds support for
> > batch sending of IPv4 UDP packets. These optimizations significantly reduce
> > the software path and support checksum offload.
> >
> > I tested those feature with
> > A typical topology is shown below:
> > veth<-->veth-peer                                    veth1-peer<--->veth1
> >       1       |                                                  |   7
> >               |2                                                6|
> >               |                                                  |
> >             bridge<------->eth0(mlnx5)- switch -eth1(mlnx5)<--->bridge1
> >                   3                    4                 5
> >              (machine1)                              (machine2)
> > AF_XDP socket is attach to veth and veth1. and send packets to physical NIC(eth0)
> > veth:(172.17.0.2/24)
> > bridge:(172.17.0.1/24)
> > eth0:(192.168.156.66/24)
> >
> > eth1(172.17.0.2/24)
> > bridge1:(172.17.0.1/24)
> > eth0:(192.168.156.88/24)
> >
> > after set default route . snat . dnat. we can have a tests
> > to get the performance results.
> >
> > packets send from veth to veth1:
> > af_xdp test tool:
> > link:https://github.com/cclinuxer/libxudp
> > send:(veth)
> > ./objs/xudpperf send --dst 192.168.156.88:6002 -l 1300
> > recv:(veth1)
> > ./objs/xudpperf recv --src 172.17.0.2:6002
> >
> > udp test tool:iperf3
> > send:(veth)
> > iperf3 -c 192.168.156.88 -p 6002 -l 1300 -b 60G -u
>
> Should be: '-b 0' otherwise you will experience additional overhead.
>

with -b 0:
performance:
performance:(test weth libxdp lib)
UDP                              : 320 Kpps (with 100% cpu)
AF_XDP   no  zerocopy + no batch : 480 Kpps (with ksoftirqd 100% cpu)
AF_XDP  with zerocopy + no batch : 540 Kpps (with ksoftirqd 100% cpu)
AF_XDP  with  batch  +  zerocopy : 1.5 Mpps (with ksoftirqd 15% cpu)

thanks.

> And you would likely pin processes and irqs to ensure BH and US run on
> different cores of the same numa node.
>
> Cheers,
>
> Paolo
>