lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CACGkMEv1m0RBWeuzgH2XnbsKZXo79Q3xqmzAPb+C5gOtUzm8WQ@mail.gmail.com>
Date: Tue, 17 Jun 2025 11:24:26 +0800
From: Jason Wang <jasowang@...hat.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: "Michael S. Tsirkin" <mst@...hat.com>, netdev@...r.kernel.org, 
	Willem de Bruijn <willemdebruijn.kernel@...il.com>, Andrew Lunn <andrew+netdev@...n.ch>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Jakub Kicinski <kuba@...nel.org>, Xuan Zhuo <xuanzhuo@...ux.alibaba.com>, 
	Eugenio Pérez <eperezma@...hat.com>, 
	Yuri Benditovich <yuri.benditovich@...nix.com>, Akihiko Odaki <akihiko.odaki@...nix.com>
Subject: Re: [PATCH RFC v3 7/8] tun: enable gso over UDP tunnel support.

On Mon, Jun 16, 2025 at 6:17 PM Paolo Abeni <pabeni@...hat.com> wrote:
>
> On 6/16/25 6:53 AM, Jason Wang wrote:
> > On Thu, Jun 12, 2025 at 7:03 PM Paolo Abeni <pabeni@...hat.com> wrote:
> >> On 6/12/25 6:55 AM, Jason Wang wrote:
> >>> On Fri, Jun 6, 2025 at 7:46 PM Paolo Abeni <pabeni@...hat.com> wrote:
> >>>> @@ -1720,8 +1732,16 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
> >>>>
> >>>>         if (tun->flags & IFF_VNET_HDR) {
> >>>>                 int vnet_hdr_sz = READ_ONCE(tun->vnet_hdr_sz);
> >>>> +               int parsed_size;
> >>>>
> >>>> -               hdr_len = tun_vnet_hdr_get(vnet_hdr_sz, tun->flags, from, &gso);
> >>>> +               if (vnet_hdr_sz < TUN_VNET_TNL_SIZE) {
> >>>
> >>> I still don't understand why we need to duplicate netdev features in
> >>> flags, and it seems to introduce unnecessary complexities. Can we
> >>> simply check dev->features instead?
> >>>
> >>> I think I've asked before, for example, we don't duplicate gso and
> >>> csum for non tunnel packets.
> >>
> >> My fear was that if
> >> - the guest negotiated VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO
> >> - tun stores the negotiated offload info netdev->features
> >> - the tun netdev UDP tunnel feature is disabled via ethtool
> >>
> >> tun may end-up sending to the guest packets without filling the tnl hdr,
> >> which should be safe, as the driver should not use such info as no GSO
> >> over UDP packets will go through, but is technically against the
> >> specification.
> >
> > Probably not? For example this is the way tun works with non tunnel GSO as well.
> >
> > (And it allows the flexibility of debugging etc).
> >
> >>
> >> The current implementation always zero the whole virtio net hdr space,
> >> so there is no such an issue.
> >>
> >> Still the additional complexity is ~5 lines and makes all the needed
> >> information available on a single int, which is quite nice performance
> >> wise. Do you have strong feeling against it?
> >
> > See above and at least we can disallow the changing of UDP tunnel GSO
> > (but I don't see too much value).
>
> I'm sorry, but I don't understand what is the suggestion/request here.
> Could you please phrase it?

I meant I don't see strong reasons to duplicate tunnel gso/csum in tun->flags:

1) extra complexity
2) non tunnel gso doesn't do this (and for macvtap, it tries to
emulate netdev->features)
3) we can find way to disallow toggling tunnel gso/csum via ethtool
(but I don't see too much value)

>
> Also please allow me to re-state my main point. The current
> implementation adds a very limited amount of code in the control path,
> and makes the data path simpler and faster - requiring no new argument
> to the tun_hdr_* helper instead of (at least) one as the other alternative.
>
> It looks like tun_hdr_* argument list could grow with every new feature,
> but I think we should try to avoid that.

See above:

1) for HOST_UDP_XXX we can assume it is enabled
2) for GUEST_UDP_XXX we can check netdev->features

So passing netdev->features seems to be sufficient, it avoids
introducing new argument when a new offload is supported in the
future.

>
> >>>> @@ -2426,7 +2460,16 @@ static int tun_xdp_one(struct tun_struct *tun,
> >>>>         if (metasize > 0)
> >>>>                 skb_metadata_set(skb, metasize);
> >>>>
> >>>> -       if (tun_vnet_hdr_to_skb(tun->flags, skb, gso)) {
> >>>> +       /* Assume tun offloads are enabled if the provided hdr is large
> >>>> +        * enough.
> >>>> +        */
> >>>> +       if (READ_ONCE(tun->vnet_hdr_sz) >= TUN_VNET_TNL_SIZE &&
> >>>> +           xdp->data - xdp->data_hard_start >= TUN_VNET_TNL_SIZE)
> >>>> +               flags = tun->flags | TUN_VNET_TNL_MASK;
> >>>> +       else
> >>>> +               flags = tun->flags & ~TUN_VNET_TNL_MASK;
> >>>
> >>> I'm not sure I get the point that we need dynamics of
> >>> TUN_VNET_TNL_MASK here. We know if tunnel gso and its csum or enabled
> >>> or not,
> >>
> >> How does tun know about VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO or
> >> VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM?
> >
> > I think it can be done in a way that works for non-tunnel gso.
> >
> > The most complicated case is probably the case HOST_UDP_TUNNEL_X is
> > enabled but GUEST_UDP_TUNNEL_X is not. In this case tun can know this
> > by:
> >
> > 1) vnet_hdr_len is large enough
> > 2) UDP tunnel GSO is not enabled in netdev->features
> >
> > If HOST_UDP_TUNNEL_X is not enabled by GUEST_UDP_TUNNEL_X is enabled,
> > it can behave like existing non-tunnel GSO: still accept the UDP GSO
> > tunnel packet.
>
> AFAICS the text above matches/describes quite accurately the
> implementation proposed in this patch and quoted above. Which in turn
> confuses me, because I don't see what you would prefer to see
> implemented differently.

I meant those can be done without using tun->flags.

>
> >> The user-space does not tell the tun device about any of the host
> >> offload features. Plain/baremetal GSO information are always available
> >> in the basic virtio net header, so there is no size check, but the
> >> overall behavior is similar - tun assumes the features have been
> >> negotiated if the relevant bits are present in the header.
> >
> > I'm not sure I understand here, there's no bit in the virtio net
> > header that tells us if the packet contains the tunnel gso field. And
> > the check of:
> >
> > READ_ONCE(tun->vnet_hdr_sz) >= TUN_VNET_TNL_SIZE
> >
> > seems to be not buggy. As qemu already did:
> >
> > static void virtio_net_set_mrg_rx_bufs(VirtIONet *n, int mergeable_rx_bufs,
> >                                        int version_1, int hash_report)
> > {
> >     int i;
> >     NetClientState *nc;
> >
> >     n->mergeable_rx_bufs = mergeable_rx_bufs;
> >
> >     if (version_1) {
> >         n->guest_hdr_len = hash_report ?
> >             sizeof(struct virtio_net_hdr_v1_hash) :
> >             sizeof(struct virtio_net_hdr_mrg_rxbuf);
> >         n->rss_data.populate_hash = !!hash_report;
> >
> > ...
>
> Note that the qemu code quoted above does not include tunnel handling.
>
> TUN_VNET_TNL_SIZE (== sizeof(struct virtio_net_hdr_v1_tunnel)) will be
> too small when VIRTIO_NET_F_HASH_REPORT is enabled, too.
>
> I did not handle that case here, due to the even greater overlapping with:
>
> https://lore.kernel.org/netdev/20250530-rss-v12-0-95d8b348de91@daynix.com/
>
> What I intended to do is:
> - set another bit in `flags` according to the negotiated
> VIRTIO_NET_F_HASH_REPORT value
> - use such info in tun_vnet_parse_size() to computed the expected vnet
> hdr len correctly.
> - replace TUN_VNET_TNL_SIZE usage in tun.c with tun_vnet_parse_size() calls
>
> I'm unsure if the above answer your question/doubt.

For hash reporting since we don't have a netdev feature for that, a
new argument for tun_hdr_XXX() is probably needed for that

>
> Anyhow I now see that keeping the UDP GSO related fields offset constant
> regardless of VIRTIO_NET_F_HASH_REPORT would remove some ambiguity from
> the relevant control path.

Not a native speaker but I think the ambiguity mainly come from the
"Only if" that is something like

"Only if VIRTIO_NET_F_HASH_REPORT negotiated"

>
> I think/hope we are still on time to update the specification clarifying
> that, but I'm hesitant to take that path due to the additional
> (hopefully small) overhead for the data path - and process overhead TBH.
>
> On the flip (positive) side such action will decouple more this series
> from the HASH_REPORT support.
>
> Please advice, thanks!
>
> /P
>

Thanks


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ