[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1f0933e0-ab58-41b8-832b-5336618be8b3@redhat.com>
Date: Mon, 16 Jun 2025 12:17:42 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Jason Wang <jasowang@...hat.com>, "Michael S. Tsirkin" <mst@...hat.com>
Cc: netdev@...r.kernel.org, Willem de Bruijn
<willemdebruijn.kernel@...il.com>, Andrew Lunn <andrew+netdev@...n.ch>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
Eugenio Pérez <eperezma@...hat.com>,
Yuri Benditovich <yuri.benditovich@...nix.com>,
Akihiko Odaki <akihiko.odaki@...nix.com>
Subject: Re: [PATCH RFC v3 7/8] tun: enable gso over UDP tunnel support.
On 6/16/25 6:53 AM, Jason Wang wrote:
> On Thu, Jun 12, 2025 at 7:03 PM Paolo Abeni <pabeni@...hat.com> wrote:
>> On 6/12/25 6:55 AM, Jason Wang wrote:
>>> On Fri, Jun 6, 2025 at 7:46 PM Paolo Abeni <pabeni@...hat.com> wrote:
>>>> @@ -1720,8 +1732,16 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>>>>
>>>> if (tun->flags & IFF_VNET_HDR) {
>>>> int vnet_hdr_sz = READ_ONCE(tun->vnet_hdr_sz);
>>>> + int parsed_size;
>>>>
>>>> - hdr_len = tun_vnet_hdr_get(vnet_hdr_sz, tun->flags, from, &gso);
>>>> + if (vnet_hdr_sz < TUN_VNET_TNL_SIZE) {
>>>
>>> I still don't understand why we need to duplicate netdev features in
>>> flags, and it seems to introduce unnecessary complexities. Can we
>>> simply check dev->features instead?
>>>
>>> I think I've asked before, for example, we don't duplicate gso and
>>> csum for non tunnel packets.
>>
>> My fear was that if
>> - the guest negotiated VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO
>> - tun stores the negotiated offload info netdev->features
>> - the tun netdev UDP tunnel feature is disabled via ethtool
>>
>> tun may end-up sending to the guest packets without filling the tnl hdr,
>> which should be safe, as the driver should not use such info as no GSO
>> over UDP packets will go through, but is technically against the
>> specification.
>
> Probably not? For example this is the way tun works with non tunnel GSO as well.
>
> (And it allows the flexibility of debugging etc).
>
>>
>> The current implementation always zero the whole virtio net hdr space,
>> so there is no such an issue.
>>
>> Still the additional complexity is ~5 lines and makes all the needed
>> information available on a single int, which is quite nice performance
>> wise. Do you have strong feeling against it?
>
> See above and at least we can disallow the changing of UDP tunnel GSO
> (but I don't see too much value).
I'm sorry, but I don't understand what is the suggestion/request here.
Could you please phrase it?
Also please allow me to re-state my main point. The current
implementation adds a very limited amount of code in the control path,
and makes the data path simpler and faster - requiring no new argument
to the tun_hdr_* helper instead of (at least) one as the other alternative.
It looks like tun_hdr_* argument list could grow with every new feature,
but I think we should try to avoid that.
>>>> @@ -2426,7 +2460,16 @@ static int tun_xdp_one(struct tun_struct *tun,
>>>> if (metasize > 0)
>>>> skb_metadata_set(skb, metasize);
>>>>
>>>> - if (tun_vnet_hdr_to_skb(tun->flags, skb, gso)) {
>>>> + /* Assume tun offloads are enabled if the provided hdr is large
>>>> + * enough.
>>>> + */
>>>> + if (READ_ONCE(tun->vnet_hdr_sz) >= TUN_VNET_TNL_SIZE &&
>>>> + xdp->data - xdp->data_hard_start >= TUN_VNET_TNL_SIZE)
>>>> + flags = tun->flags | TUN_VNET_TNL_MASK;
>>>> + else
>>>> + flags = tun->flags & ~TUN_VNET_TNL_MASK;
>>>
>>> I'm not sure I get the point that we need dynamics of
>>> TUN_VNET_TNL_MASK here. We know if tunnel gso and its csum or enabled
>>> or not,
>>
>> How does tun know about VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO or
>> VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM?
>
> I think it can be done in a way that works for non-tunnel gso.
>
> The most complicated case is probably the case HOST_UDP_TUNNEL_X is
> enabled but GUEST_UDP_TUNNEL_X is not. In this case tun can know this
> by:
>
> 1) vnet_hdr_len is large enough
> 2) UDP tunnel GSO is not enabled in netdev->features
>
> If HOST_UDP_TUNNEL_X is not enabled by GUEST_UDP_TUNNEL_X is enabled,
> it can behave like existing non-tunnel GSO: still accept the UDP GSO
> tunnel packet.
AFAICS the text above matches/describes quite accurately the
implementation proposed in this patch and quoted above. Which in turn
confuses me, because I don't see what you would prefer to see
implemented differently.
>> The user-space does not tell the tun device about any of the host
>> offload features. Plain/baremetal GSO information are always available
>> in the basic virtio net header, so there is no size check, but the
>> overall behavior is similar - tun assumes the features have been
>> negotiated if the relevant bits are present in the header.
>
> I'm not sure I understand here, there's no bit in the virtio net
> header that tells us if the packet contains the tunnel gso field. And
> the check of:
>
> READ_ONCE(tun->vnet_hdr_sz) >= TUN_VNET_TNL_SIZE
>
> seems to be not buggy. As qemu already did:
>
> static void virtio_net_set_mrg_rx_bufs(VirtIONet *n, int mergeable_rx_bufs,
> int version_1, int hash_report)
> {
> int i;
> NetClientState *nc;
>
> n->mergeable_rx_bufs = mergeable_rx_bufs;
>
> if (version_1) {
> n->guest_hdr_len = hash_report ?
> sizeof(struct virtio_net_hdr_v1_hash) :
> sizeof(struct virtio_net_hdr_mrg_rxbuf);
> n->rss_data.populate_hash = !!hash_report;
>
> ...
Note that the qemu code quoted above does not include tunnel handling.
TUN_VNET_TNL_SIZE (== sizeof(struct virtio_net_hdr_v1_tunnel)) will be
too small when VIRTIO_NET_F_HASH_REPORT is enabled, too.
I did not handle that case here, due to the even greater overlapping with:
https://lore.kernel.org/netdev/20250530-rss-v12-0-95d8b348de91@daynix.com/
What I intended to do is:
- set another bit in `flags` according to the negotiated
VIRTIO_NET_F_HASH_REPORT value
- use such info in tun_vnet_parse_size() to computed the expected vnet
hdr len correctly.
- replace TUN_VNET_TNL_SIZE usage in tun.c with tun_vnet_parse_size() calls
I'm unsure if the above answer your question/doubt.
Anyhow I now see that keeping the UDP GSO related fields offset constant
regardless of VIRTIO_NET_F_HASH_REPORT would remove some ambiguity from
the relevant control path.
I think/hope we are still on time to update the specification clarifying
that, but I'm hesitant to take that path due to the additional
(hopefully small) overhead for the data path - and process overhead TBH.
On the flip (positive) side such action will decouple more this series
from the HASH_REPORT support.
Please advice, thanks!
/P
Powered by blists - more mailing lists