[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8734njyn8j.fsf@cloudflare.com>
Date: Mon, 05 Aug 2024 17:59:24 +0200
From: Jakub Sitnicki <jakub@...udflare.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>, Eric
Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo
Abeni <pabeni@...hat.com>, Willem de Bruijn <willemb@...gle.com>,
kernel-team@...udflare.com,
syzbot+e15b7e15b8a751a91d9a@...kaller.appspotmail.com
Subject: Re: [PATCH net v2 1/2] gso: Skip bad offload detection when device
supports requested GSO
On Mon, Aug 05, 2024 at 10:25 AM -04, Willem de Bruijn wrote:
> Jakub Sitnicki wrote:
>> On Thu, Aug 01, 2024 at 03:13 PM -04, Willem de Bruijn wrote:
[...]
>> > It's a bit odd, in that the ip_summed == CHECKSUM_NONE ends up just
>> > being ignored and devices are trusted to always be able to checksum
>> > offload when they can segment offload -- even when the device does not
>> > advertise checksum offload.
>> >
>> > I think we should have a follow-on that makes advertising
>> > NETIF_F_GSO_UDP_L4 dependent on having at least one of the
>> > NETIF_F_*_CSUM bits set (handwaving over what happens when only
>> > advertising NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM).
>>
>> I agree. I've also gained some clarity as to how the fix should
>> look. Let's circle back to it, if we still think it's relevant once we
>> hash out the fix.
>>
>> After spending some quality time debugging the addded regression test
>> [1], I've realized this fix is wrong.
>>
>> You see, with commit 10154dbded6d ("udp: Allow GSO transmit from devices
>> with no checksum offload"), I've opened up the UDP_SEGMENT API to two
>> uses, which I think should not be allowed:
>>
>> 1. Hardware USO for IPv6 dgrams with extension headers
>>
>> Previously that led to -EIO, because __ip6_append_data won't annotate
>> such packets as CHECKSUM_PARTIAL.
>>
>> I'm guessing that we do this because some drivers that advertise csum
>> offload can't actually handle checksumming when extension headers are
>> present.
>>
>> Extension headers are not part of IPv6 pseudo header, but who knows what
>> limitations NIC firmwares have.
>>
>> Either way, changing it just like that sounds risky, so I think we need
>> to fall back to software USO with software checksum in this case.
>>
>> Alternatively, we could catch it in the udp layer and error out with EIO
>> as ealier. But that shifts some burden onto the user space (detect and
>> segment before sendmsg()).
>>
>> 2. Hardware USO when hardware csum is unsupported / disabled
>>
>> That sounds like a pathological device configuration case, but since it
>> is possible today with some drivers to disable csum offload for one IP
>> version and not the other, it seems safest to just handle that
>> gracefully.
>>
>> I don't know why one might want to do that. Perhaps as a workaround for
>> some firmware bug while waiting for a fix?
>
> I doubt that this is actually used. But today it can be configured.
>
> Which is why I suggested making NETIF_F_GSO_UDP_L4 dependent on csum
> offload (in netdev_fix_features). I doubt that that will break any
> real user.
Sounds like a plan. If we're talking about simply disabling GSO_UDP_L4
whenever either NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM is "off", then that
is straightforward. And the NETIF_F_HW_CSUM dependency is clear.
I could even piggy back it on this series, at the risk of additional
iterations.
>
>> In this scenario I think we also need to fall back to software USO and
>> checksum.
>>
>> Code-wise that could look like below. WDYT?
>
> Since this only affects USO, can we fix this is in __udp_gso_segment.
> Basically, not taking the NETIF_F_GSO_ROBUST path.
>
> skb_segment is so complicated already. Whatever we can do to avoid
> adding to that.
skb_segment is a complex beast. No disagreement there.
Keeping the changes down seems doable. We can drive skb_segment to
compute the checksum, when we know that's needed (because IPv6 extension
headers are present -> ip_summed is CHECKSUM_NONE) by masking off csum
flags. Thanks for the suggestion.
[...]
Powered by blists - more mailing lists