Message-ID: <9dd17a20-b5b8-4385-9a61-d9647da337a9@gmail.com>
Date: Fri, 13 Jun 2025 22:58:38 +0700
From: Bui Quang Minh <minhquangbui99@...il.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org,
"Michael S. Tsirkin" <mst@...hat.com>, Jason Wang <jasowang@...hat.com>,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>, Eugenio Pérez
<eperezma@...hat.com>, Andrew Lunn <andrew+netdev@...n.ch>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
Jesper Dangaard Brouer <hawk@...nel.org>,
John Fastabend <john.fastabend@...il.com>, virtualization@...ts.linux.dev,
linux-kernel@...r.kernel.org, bpf@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH net] virtio-net: drop the multi-buffer XDP packet in
zerocopy
On 6/11/25 03:37, Jakub Kicinski wrote:
> On Tue, 10 Jun 2025 22:18:32 +0700 Bui Quang Minh wrote:
>>>> Furthermore, we are in zerocopy mode, so we cannot linearize by
>>>> allocating a large enough buffer to cover the whole frame and then copying
>>>> the frame data into it; that's not zerocopy anymore. Also, XDP socket
>>>> zerocopy receive assumes that the packets it receives come from the umem
>>>> pool. AFAIK, the generic XDP path is for copy mode only.
>>> Generic XDP == do_xdp_generic(), here I think you mean the normal XDP
>>> path in the virtio driver? If so then no, XDP is very much not
>>> expected to copy each frame before processing.
>> Yes, by generic XDP I mean do_xdp_generic(). My point is that we can
>> linearize the frame if needed (like in netif_skb_check_for_xdp()) in copy
>> mode for an XDP socket, but not in zerocopy mode.
> Okay, I meant the copies in the driver - virtio calls
> xdp_linearize_page() in a few places, for normal XDP.
>
>>> This is only slightly related to you patch but while we talk about
>>> multi-buf - in the netdev CI the test which sends ping while XDP
>>> multi-buf program is attached is really flaky :(
>>> https://netdev.bots.linux.dev/contest.html?executor=vmksft-drv-hw&test=ping-py.ping-test-xdp-native-mb&ld-cases=1
>> metal-drv-hw means the NETIF is the real NIC, right?
> The "metal" in the name refers to the AWS instance type that hosts
> the runner. The test runs in a VM over virtio, more details:
> https://github.com/linux-netdev/nipa/wiki/Running-driver-tests-on-virtio
I've figured out the problem. When the test fails, we hit this check in
mergeable_xdp_get_buf():
        xdp_room = SKB_DATA_ALIGN(XDP_PACKET_HEADROOM +
                                  sizeof(struct skb_shared_info));
        if (*len + xdp_room > PAGE_SIZE)
                return NULL;
Here *len + xdp_room > PAGE_SIZE, so NULL is returned and the packet is
dropped. This happens when add_recvbuf_mergeable() was called while no XDP
program was loaded, so it did not reserve space for XDP_PACKET_HEADROOM and
struct skb_shared_info. But by the time vhost uses that buffer and sends it
back to virtio-net, an XDP program has been loaded.
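To make that concrete: the refill path only reserves the extra room while
an XDP program is attached. Roughly (paraphrased from memory of
drivers/net/virtio_net.c, not the exact upstream code):

        /* Headroom is reserved only while an XDP program is attached. */
        static unsigned int virtnet_get_headroom(struct virtnet_info *vi)
        {
                return vi->xdp_enabled ? XDP_PACKET_HEADROOM : 0;
        }

        /* In add_recvbuf_mergeable(), simplified: with no XDP program at
         * refill time, headroom and tailroom are 0, so the buffer posted
         * to the device has no spare room for XDP_PACKET_HEADROOM or
         * struct skb_shared_info.
         */
        unsigned int headroom = virtnet_get_headroom(vi);
        unsigned int tailroom = headroom ? sizeof(struct skb_shared_info) : 0;
        unsigned int room = SKB_DATA_ALIGN(headroom + tailroom);

So the device can fill such a buffer with close to PAGE_SIZE bytes of frame
data, and once an XDP program is attached mergeable_xdp_get_buf() can no
longer fit the required extra room into the same page.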
The code assumes that an XDP frag cannot exceed PAGE_SIZE, which I think is
no longer correct. Because of that assumption, when frame data +
XDP_PACKET_HEADROOM + sizeof(struct skb_shared_info) > PAGE_SIZE, the code
does not build an xdp_buff and drops the frame instead.
xdp_linearize_page() has the same assumption. Since I don't think the
assumption holds anymore, the fix might be to allocate a buffer big enough
to build the xdp_buff.
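A very rough sketch of the direction I mean (illustrative only, not a real
patch: the helper name is made up, and error handling plus the virtio-net
plumbing are omitted). When the frame plus the XDP metadata does not fit in
one page, allocate a big enough contiguous buffer and copy the frame into
it instead of returning NULL:

        static void *xdp_copy_to_big_buf(const void *frame, unsigned int len)
        {
                unsigned int xdp_room = SKB_DATA_ALIGN(XDP_PACKET_HEADROOM +
                                                       sizeof(struct skb_shared_info));
                struct page *page;

                /* Round up to enough contiguous pages for frame data +
                 * headroom + skb_shared_info.
                 */
                page = alloc_pages(GFP_ATOMIC, get_order(len + xdp_room));
                if (!page)
                        return NULL;

                /* Keep XDP_PACKET_HEADROOM in front so the XDP program can
                 * still grow headers; skb_shared_info goes at the tail.
                 */
                memcpy(page_address(page) + XDP_PACKET_HEADROOM, frame, len);
                return page_address(page);
        }

This trades a copy for not dropping the frame, similar to what
xdp_linearize_page() already does for frames it needs to linearize.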
Thanks,
Quang Minh.