Message-ID: <CACGkMEuvubWfg8Wc+=eNqg1rHR+PD6jsH7_QEJV6=S+DUVdThQ@mail.gmail.com>
Date: Thu, 25 Jan 2024 11:39:28 +0800
From: Jason Wang <jasowang@...hat.com>
To: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Cc: netdev@...r.kernel.org, "Michael S. Tsirkin" <mst@...hat.com>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, Jesper Dangaard Brouer <hawk@...nel.org>,
John Fastabend <john.fastabend@...il.com>, virtualization@...ts.linux.dev,
bpf@...r.kernel.org
Subject: Re: [PATCH net-next 0/5] virtio-net: sq support premapped mode
On Tue, Jan 16, 2024 at 3:59 PM Xuan Zhuo <xuanzhuo@...ux.alibaba.com> wrote:
>
> This is the second part of virtio-net support AF_XDP zero copy.
>
> The whole patch set
> http://lore.kernel.org/all/20231229073108.57778-1-xuanzhuo@linux.alibaba.com
>
> ## About the branch
>
> This patch set is pushed to the net-next branch, but some patches are about
> virtio core. Because the entire patch set for virtio-net to support AF_XDP
> should be pushed to net-next, I hope these patches will be merged into net-next
> with the virtio core maintainers' Acked-by.
>
> ============================================================================
>
> ## AF_XDP
>
> XDP socket (AF_XDP) is an excellent kernel-bypass network framework. The zero
> copy feature of xsk (XDP socket) needs to be supported by the driver. The
> performance of zero copy is very good. mlx5 and intel ixgbe already support
> this feature. This patch set allows virtio-net to support xsk's zerocopy xmit
> feature.
>
> At present, we have completed some preparation:
>
> 1. vq-reset (virtio spec and kernel code)
> 2. virtio-core premapped dma
> 3. virtio-net xdp refactor
>
> So it is time for Virtio-Net to complete the support for the XDP Socket
> Zerocopy.
>
> Virtio-net can not increase the queue num at will, so xsk shares the queue with
> kernel.
>
> On the other hand, Virtio-Net does not support generating an interrupt from
> the driver manually, so we use a trick to wake up tx xmit. If TX NAPI last ran
> on another CPU, we use an IPI to wake up NAPI on that remote CPU. If it last
> ran on the local CPU, we wake up NAPI directly.
>
> This patch set includes some refactoring of virtio-net so that it can support
> AF_XDP.
>
> ## performance
>
> ENV: Qemu with vhost-user(polling mode).
> Host CPU: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
>
> ### virtio PMD in guest with testpmd
>
> testpmd> show port stats all
>
> ######################## NIC statistics for port 0 ########################
> RX-packets: 19531092064 RX-missed: 0 RX-bytes: 1093741155584
> RX-errors: 0
> RX-nombuf: 0
> TX-packets: 5959955552 TX-errors: 0 TX-bytes: 371030645664
>
>
> Throughput (since last show)
> Rx-pps: 8861574 Rx-bps: 3969985208
> Tx-pps: 8861493 Tx-bps: 3969962736
> ############################################################################
>
> ### AF_XDP PMD in guest with testpmd
>
> testpmd> show port stats all
>
> ######################## NIC statistics for port 0 ########################
> RX-packets: 68152727 RX-missed: 0 RX-bytes: 3816552712
> RX-errors: 0
> RX-nombuf: 0
> TX-packets: 68114967 TX-errors: 33216 TX-bytes: 3814438152
>
> Throughput (since last show)
> Rx-pps: 6333196 Rx-bps: 2837272088
> Tx-pps: 6333227 Tx-bps: 2837285936
> ############################################################################
>
> But AF_XDP consumes more CPU for tx and rx napi(100% and 86%).
>
> ## maintain
>
> I am currently a reviewer for virtio-net. I commit to maintaining AF_XDP
> support in virtio-net.
>
> Please review.
>
Rethinking the whole design, I have one question:

The reason we need to store DMA information is to harden the virtqueue,
to make sure the DMA unmap is safe. This seems redundant when the
buffers are premapped by the driver. For example, the receive queue
already maintains its own DMA information, so it doesn't need desc_extra
to work.

So can we simply:

1) when premapping is enabled, have the driver store the DMA information itself
2) not store DMA information in desc_extra
Would this be simpler?
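
To make the idea concrete, here is a rough userspace sketch of the split I
have in mind. It is only a toy model, not the virtio core code: every toy_*
name, the fixed ring size and the "who unmaps" return value are assumptions
made up for illustration. When the queue is premapped, the driver keeps the
DMA address and length in its own table and the core-side extra array is
left untouched; otherwise the core records them as it does today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RING_SIZE 8

/* Toy stand-in for the core's per-descriptor DMA bookkeeping (hypothetical). */
struct toy_desc_extra {
	uint64_t addr;
	uint32_t len;
	bool valid;
};

struct toy_vq {
	bool premapped;
	struct toy_desc_extra extra[TOY_RING_SIZE];
};

/* Driver-owned DMA record, used only in premapped mode (hypothetical). */
struct toy_drv_dma {
	uint64_t addr;
	uint32_t len;
};

struct toy_driver {
	struct toy_vq vq;
	struct toy_drv_dma dma[TOY_RING_SIZE];
};

enum toy_unmap_owner { TOY_UNMAP_BY_CORE, TOY_UNMAP_BY_DRIVER };

/* Queue a buffer. Returns true if the core recorded DMA info for it. */
static bool toy_add_buf(struct toy_driver *drv, unsigned int id,
			uint64_t addr, uint32_t len)
{
	assert(id < TOY_RING_SIZE);

	if (drv->vq.premapped) {
		/* 1) premapped: the driver stores the DMA info itself */
		drv->dma[id].addr = addr;
		drv->dma[id].len = len;
		/* 2) nothing is written to the core's extra array */
		return false;
	}

	drv->vq.extra[id].addr = addr;
	drv->vq.extra[id].len = len;
	drv->vq.extra[id].valid = true;
	return true;
}

/* Complete a buffer: report what to unmap and who is responsible for it. */
static enum toy_unmap_owner toy_detach_buf(struct toy_driver *drv,
					   unsigned int id,
					   uint64_t *addr, uint32_t *len)
{
	assert(id < TOY_RING_SIZE);

	if (drv->vq.premapped) {
		*addr = drv->dma[id].addr;
		*len = drv->dma[id].len;
		return TOY_UNMAP_BY_DRIVER;
	}

	assert(drv->vq.extra[id].valid);
	*addr = drv->vq.extra[id].addr;
	*len = drv->vq.extra[id].len;
	drv->vq.extra[id].valid = false;
	return TOY_UNMAP_BY_CORE;
}
```

In this model the premapped path never reads or writes the extra array, which
is the point of the question above: the hardening copy is only needed when
the core does the mapping itself.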
Thanks