Message-ID: <CABKxMyMj4+9r44GFO3PM15516Pkmr=g8WBVhdjv7tk231fGqWg@mail.gmail.com>
Date: Fri, 4 Aug 2023 12:16:05 +0800
From: 黄杰 <huangjie.albert@...edance.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
Jesper Dangaard Brouer <hawk@...nel.org>, John Fastabend <john.fastabend@...il.com>,
Björn Töpel <bjorn@...nel.org>,
Magnus Karlsson <magnus.karlsson@...el.com>,
Maciej Fijalkowski <maciej.fijalkowski@...el.com>, Jonathan Lemon <jonathan.lemon@...il.com>,
Pavel Begunkov <asml.silence@...il.com>, Yunsheng Lin <linyunsheng@...wei.com>,
Kees Cook <keescook@...omium.org>, Richard Gobert <richardbgobert@...il.com>,
"open list:NETWORKING DRIVERS" <netdev@...r.kernel.org>, open list <linux-kernel@...r.kernel.org>,
"open list:XDP (eXpress Data Path)" <bpf@...r.kernel.org>
Subject: Re: [External] Re: [RFC Optimizing veth xsk performance 00/10]
On Thu, Aug 3, 2023 at 22:20, Paolo Abeni <pabeni@...hat.com> wrote:
>
> On Thu, 2023-08-03 at 22:04 +0800, huangjie.albert wrote:
> > AF_XDP is a kernel bypass technology that can greatly improve performance.
> > However, for virtual devices like veth, even with the use of AF_XDP sockets,
> > there are still many additional software paths that consume CPU resources.
> > This patch series focuses on optimizing the performance of AF_XDP sockets
> > for veth virtual devices. Patches 1 to 4 mainly involve preparatory work.
> > Patch 5 introduces tx queue and tx napi for packet transmission, while
> > patch 9 primarily implements zero-copy, and patch 10 adds support for
> > batch sending of IPv4 UDP packets. These optimizations significantly reduce
> > the software path and support checksum offload.
> >
> > I tested these features with the setup below.
> > A typical topology is shown below:
> > veth<-->veth-peer                                         veth1-peer<--->veth1
> >      1      |                                                   |     7
> >             |2                                                 6|
> >             |                                                   |
> >           bridge<------->eth0(mlnx5)- switch -eth1(mlnx5)<--->bridge1
> >                     3                   4                  5
> >         (machine1)                                          (machine2)
> > The AF_XDP sockets are attached to veth and veth1 and send packets to the physical NIC (eth0).
> > veth:(172.17.0.2/24)
> > bridge:(172.17.0.1/24)
> > eth0:(192.168.156.66/24)
> >
> > eth1(172.17.0.2/24)
> > bridge1:(172.17.0.1/24)
> > eth0:(192.168.156.88/24)
> >
> > After setting the default route, SNAT, and DNAT, we can run tests
> > to get the performance results.
> >
> > packets send from veth to veth1:
> > af_xdp test tool:
> > link:https://github.com/cclinuxer/libxudp
> > send:(veth)
> > ./objs/xudpperf send --dst 192.168.156.88:6002 -l 1300
> > recv:(veth1)
> > ./objs/xudpperf recv --src 172.17.0.2:6002
> >
> > udp test tool:iperf3
> > send:(veth)
> > iperf3 -c 192.168.156.88 -p 6002 -l 1300 -b 60G -u
>
> Should be: '-b 0' otherwise you will experience additional overhead.
>
With -b 0:
performance (tested with the libxdp lib):
UDP                             : 320 Kpps (with 100% cpu)
AF_XDP no zerocopy + no batch   : 480 Kpps (with ksoftirqd 100% cpu)
AF_XDP with zerocopy + no batch : 540 Kpps (with ksoftirqd 100% cpu)
AF_XDP with batch + zerocopy    : 1.5 Mpps (with ksoftirqd 15% cpu)
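
For a quick comparison, the gain of each AF_XDP mode over plain UDP can be worked out from the rates above. This is only a small sketch: the `speedup` helper name is made up for illustration, and the inputs are the Kpps figures listed above (1.5 Mpps taken as 1500 Kpps).

```shell
# speedup NEW BASE: print NEW/BASE to two decimals (both rates in Kpps).
# Hypothetical helper, not part of libxudp or iperf3.
speedup() {
    awk -v new="$1" -v base="$2" 'BEGIN { printf "%.2f\n", new / base }'
}

speedup 480 320     # AF_XDP no zerocopy + no batch vs. UDP
speedup 540 320     # AF_XDP with zerocopy + no batch vs. UDP
speedup 1500 320    # AF_XDP with batch + zerocopy vs. UDP
```

That is roughly 1.5x, 1.7x and 4.7x over UDP. Note that the last case also uses far less ksoftirqd CPU, which the ratio alone does not show.
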
thanks.
> And you would likely pin processes and irqs to ensure BH and US run on
> different cores of the same numa node.
>
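
One possible way to pick two distinct cores on the same NUMA node is sketched below. It is a rough sketch under assumptions: the helper names are hypothetical, node0 and the IRQ number are placeholders rather than values from this setup, and cpulist strides such as "0-7:2" are not handled.

```shell
# Expand a Linux cpulist string (the format of
# /sys/devices/system/node/nodeN/cpulist, e.g. "0-3,8-11")
# into one CPU id per line. Hypothetical helper.
expand_cpulist() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r lo hi; do
        [ -n "$lo" ] || continue
        seq "$lo" "${hi:-$lo}"
    done
}

# Print the first two CPUs of a cpulist: one for the user-space
# sender (US), one for softirq/NAPI work (BH). Fails if fewer than two.
pick_us_bh_cores() {
    set -- $(expand_cpulist "$1")
    [ "$#" -ge 2 ] || return 1
    echo "$1 $2"
}

# Example use (placeholders: node0, <irq>; needs root for the IRQ write):
#   set -- $(pick_us_bh_cores "$(cat /sys/devices/system/node/node0/cpulist)")
#   taskset -c "$1" ./objs/xudpperf send --dst 192.168.156.88:6002 -l 1300
#   echo "$2" > /proc/irq/<irq>/smp_affinity_list
```
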
> Cheers,
>
> Paolo
>