Message-ID: <CAF=yD-+GNV_1HLyBKGeZuVkRGPEMmyQ4+MX9cLvyC1mC9a+dvg@mail.gmail.com>
Date: Wed, 8 Nov 2023 10:10:02 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Jakub Sitnicki <jakub@...udflare.com>
Cc: Willem de Bruijn <willemb@...gle.com>, netdev@...r.kernel.org, kernel-team@...udflare.com
Subject: Re: EIO on send with UDP_SEGMENT
On Wed, Nov 8, 2023 at 6:03 AM Jakub Sitnicki <jakub@...udflare.com> wrote:
>
> Hi Willem et al,
>
> We have hit the EIO error path in udp_send_skb introduced in commit bec1f6f69736
> ("udp: generate gso with UDP_SEGMENT") [0]:
>
>     if (skb->ip_summed != CHECKSUM_PARTIAL || ...) {
>             kfree_skb(skb);
>             return -EIO;
>     }
>
> ... when attempting to send a GSO packet, using UDP_SEGMENT option, from
> a TUN device which didn't have any offloads enabled (the default case).
>
> A trivial reproducer for that would be:
>
> ip tuntap add dev tun0 mode tun
> ip addr add dev tun0 192.0.2.1/24
> ip link set dev tun0 up
>
> strace -e %net python -c '
> from socket import *
> s = socket(AF_INET, SOCK_DGRAM)
> s.setsockopt(SOL_UDP, 103, 1200)
> s.sendto(b"x" * 3000, ("192.0.2.2", 9))
> '
>
> which yields:
>
> socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC, IPPROTO_IP) = 3
> setsockopt(3, SOL_UDP, UDP_SEGMENT, [1200], 4) = 0
> sendto(3, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 3000, 0, {sa_family=AF_INET, sin_port=htons(9), sin_addr=inet_addr("192.0.2.2")}, 16) = -1 EIO (Input/output error)
>
> This has been a surprise and caused us some pain. I think it comes down
> to that anyone using UDP_SEGMENT has to implement a segmentation
> fallback in user-space. Just to be on the safe side. We can't really
> assume that any TUN/TAP interface, which happens to be our egress
> device, has at least checksum offload enabled and implemented.
>
> Which is not ideal.
> So it made us wonder if anything can be done about it?
>
> As it turns out, skb_segment() in GSO path implements a software
> fallback not only for segmentation but also for checksumming [1].
>
> What is more, when we removed the skb->ip_summed == CHECKSUM_PARTIAL
> restriction in udp_send, as an experiment, we were able to observe fully
> checksummed segments in packet capture.
>
> Which brings me to my question -
>
> Do you think the restriction in udp_send_skb can be lifted or tweaked?

The argument against lifting it has been that segmentation offload
offers no performance benefit if the stack has to fall back onto
software checksumming.

If this limitation makes userspace code more complex, because it has
to branch between using segmentation offload or not depending on
device features, that would be an argument to drop it. As you point
out, the check is not needed for correctness.
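
To make the userspace cost concrete, here is a rough sketch of the kind of
branching an application ends up carrying today. It is an illustration
only, not code from any real project: the helper names (split_segments,
send_gso_with_fallback) are made up, UDP_SEGMENT is hardcoded to 103 as in
your reproducer, and it assumes that setting the option back to 0 disables
GSO on the socket.

```python
import errno
import socket

UDP_SEGMENT = 103  # from linux/udp.h; the same value the reproducer passes


def split_segments(payload, gso_size):
    """Cut the payload into gso_size chunks; the last one may be shorter."""
    return [payload[i:i + gso_size] for i in range(0, len(payload), gso_size)]


def send_gso_with_fallback(sock, payload, addr, gso_size):
    """Try a single GSO send; on EIO, send each segment as its own datagram.

    Hypothetical helper. Returns the total number of payload bytes sent.
    """
    # IPPROTO_UDP has the same value (17) as SOL_UDP on Linux.
    sock.setsockopt(socket.IPPROTO_UDP, UDP_SEGMENT, gso_size)
    try:
        return sock.sendto(payload, addr)
    except OSError as e:
        # Only the EIO case discussed in this thread is handled here.
        if e.errno != errno.EIO:
            raise

    # Assumption: a gso_size of 0 turns segmentation offload off again.
    sock.setsockopt(socket.IPPROTO_UDP, UDP_SEGMENT, 0)
    return sum(sock.sendto(seg, addr)
               for seg in split_segments(payload, gso_size))
```

With the 3000 byte payload and gso_size of 1200 from the reproducer, the
fallback path would send three datagrams of 1200, 1200 and 600 bytes.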
>
> Thanks,
> Jakub
>
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bec1f6f697362c5bc635dacd7ac8499d0a10a4e7
> [1] https://elixir.bootlin.com/linux/v6.6/source/net/core/skbuff.c#L4626
>