Message-Id: <20230807120434.83644-1-huangjie.albert@bytedance.com>
Date: Mon, 7 Aug 2023 20:04:20 +0800
From: Albert Huang <huangjie.albert@...edance.com>
To:
Cc: Albert Huang <huangjie.albert@...edance.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Björn Töpel <bjorn@...nel.org>,
Magnus Karlsson <magnus.karlsson@...el.com>,
Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
Jonathan Lemon <jonathan.lemon@...il.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Jesper Dangaard Brouer <hawk@...nel.org>,
John Fastabend <john.fastabend@...il.com>,
Pavel Begunkov <asml.silence@...il.com>,
Richard Gobert <richardbgobert@...il.com>,
Menglong Dong <imagedong@...cent.com>,
Yunsheng Lin <linyunsheng@...wei.com>,
netdev@...r.kernel.org (open list:NETWORKING DRIVERS),
linux-kernel@...r.kernel.org (open list),
bpf@...r.kernel.org (open list:XDP SOCKETS (AF_XDP))
Subject: [RFC v2 Optimizing veth xsk performance 0/9]
AF_XDP is a kernel bypass technology that can greatly improve performance.
However, for virtual devices like veth, even with the use of AF_XDP sockets,
there are still many additional software paths that consume CPU resources.
This patch series focuses on optimizing the performance of AF_XDP sockets
for veth virtual devices. Patches 1 to 4 mainly involve preparatory work.
Patch 5 introduces tx queue and tx napi for packet transmission, while
patch 8 primarily implements batch sending for IPv4 UDP packets, and patch 9
adds support for the AF_XDP Tx need_wakeup feature. These optimizations significantly
reduce the software path and support checksum offload.
I tested these features with the topology shown below:
client(send):                                                 server:(recv)
veth<-->veth-peer                                    veth1-peer<--->veth1
   1        |                                               |          7
            |2                                             6|
            |                                               |
          bridge<------->eth0(mlnx5)- switch -eth1(mlnx5)<--->bridge1
                    3              4             5
              (machine1)                           (machine2)
An AF_XDP socket is attached to both veth and veth1, and packets are sent from veth out through the physical NIC (eth0).
machine1:
  veth:    172.17.0.2/24
  bridge:  172.17.0.1/24
  eth0:    192.168.156.66/24
machine2:
  veth1:   172.17.0.2/24
  bridge1: 172.17.0.1/24
  eth1:    192.168.156.88/24
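For reference, a minimal iproute2 sketch of how the veth/bridge side of such
a topology could be built on machine1 (hypothetical commands, not taken from
this series; namespace handling is omitted for brevity):

# create the veth pair and attach the peer to a bridge
ip link add veth type veth peer name veth-peer
ip link add name bridge type bridge
ip link set veth-peer master bridge
ip addr add 172.17.0.2/24 dev veth
ip addr add 172.17.0.1/24 dev bridge
ip link set veth up && ip link set veth-peer up && ip link set bridge up
# allow routing between the bridge subnet and eth0
sysctl -w net.ipv4.ip_forward=1

machine2 mirrors this with veth1/bridge1/eth1.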
After setting up the default route, SNAT, and DNAT, we can run the tests
to get the performance results.
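The series does not spell these rules out; a minimal sketch of what the
routing and NAT setup might look like (again hypothetical; addresses from the
list above, UDP port 6002 from the tests below):

# machine1: default route out of the veth subnet, SNAT on egress via eth0
ip route add default via 172.17.0.1        # from the namespace holding veth
iptables -t nat -A POSTROUTING -s 172.17.0.0/24 -o eth0 -j MASQUERADE
# machine2: DNAT test traffic arriving on eth1 to veth1
iptables -t nat -A PREROUTING -i eth1 -p udp --dport 6002 \
         -j DNAT --to-destination 172.17.0.2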
Packets are sent from veth to veth1:
AF_XDP test tool:
link: https://github.com/cclinuxer/libxudp
send:(veth)
./objs/xudpperf send --dst 192.168.156.88:6002 -l 1300
recv:(veth1)
./objs/xudpperf recv --src 172.17.0.2:6002
UDP test tool: iperf3
send:(veth)
iperf3 -c 192.168.156.88 -p 6002 -l 1300 -b 0 -u
recv:(veth1)
iperf3 -s -p 6002
Performance (veth tested with the libxudp library):
UDP                           : 320 Kpps (with 100% CPU)
AF_XDP no zerocopy + no batch : 480 Kpps (with ksoftirqd 100% CPU)
AF_XDP with batch + zerocopy  : 1.5 Mpps (with ksoftirqd 15% CPU)
With AF_XDP batching, the libxudp user-space program becomes the bottleneck,
so the softirq side does not reach its limit.
This is just an RFC patch series, and some code details still need
further consideration. Please review this proposal.
Thanks!
v1->v2:
- all patches now pass the checkpatch.pl test (suggested by Simon Horman).
- tested iperf3 with -b 0 and updated the test results (suggested by Paolo Abeni).
- refactored the code to make its structure clearer.
- removed some unnecessary logic from the veth_xsk_tx_xmit function.
- added support for the AF_XDP Tx need_wakeup feature.
Albert Huang (9):
veth: Implement ethtool's get_ringparam() callback
xsk: add dma_check_skip for skipping dma check
veth: add support for send queue
xsk: add xsk_tx_completed_addr function
veth: use send queue tx napi to xmit xsk tx desc
veth: add ndo_xsk_wakeup callback for veth
sk_buff: add destructor_arg_xsk_pool for zero copy
veth: af_xdp tx batch support for ipv4 udp
veth: add support for AF_XDP tx need_wakeup feature
drivers/net/veth.c | 679 +++++++++++++++++++++++++++++++++++-
include/linux/skbuff.h | 2 +
include/net/xdp_sock_drv.h | 1 +
include/net/xsk_buff_pool.h | 1 +
net/xdp/xsk.c | 6 +
net/xdp/xsk_buff_pool.c | 3 +-
net/xdp/xsk_queue.h | 10 +
7 files changed, 700 insertions(+), 2 deletions(-)
--
2.20.1