[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <a88ee88c-707f-4266-b514-d0390166dedb@gmail.com>
Date: Tue, 26 Aug 2025 16:31:50 +0200
From: Richard Gobert <richardbgobert@...il.com>
To: Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
horms@...nel.org, corbet@....net, saeedm@...dia.com, tariqt@...dia.com,
mbloch@...dia.com, leon@...nel.org, ecree.xilinx@...il.com,
dsahern@...nel.org, ncardwell@...gle.com, kuniyu@...gle.com,
shuah@...nel.org, sdf@...ichev.me, aleksander.lobakin@...el.com,
florian.fainelli@...adcom.com, willemdebruijn.kernel@...il.com,
alexander.duyck@...il.com, linux-kernel@...r.kernel.org,
linux-net-drivers@....com
Subject: Re: [PATCH net-next v3 3/5] net: gso: restore ids of outer ip headers
correctly
Paolo Abeni wrote:
> On 8/25/25 3:31 PM, Richard Gobert wrote:
>> Paolo Abeni wrote:
>>> On 8/21/25 9:30 AM, Richard Gobert wrote:
>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>> index 68dc47d7e700..9941c39b5970 100644
>>>> --- a/net/core/dev.c
>>>> +++ b/net/core/dev.c
>>>> @@ -3772,10 +3772,9 @@ static netdev_features_t gso_features_check(const struct sk_buff *skb,
>>>> * IPv4 header has the potential to be fragmented.
>>>> */
>>>> if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4) {
>>>> - struct iphdr *iph = skb->encapsulation ?
>>>> - inner_ip_hdr(skb) : ip_hdr(skb);
>>>> -
>>>> - if (!(iph->frag_off & htons(IP_DF)))
>>>> + if (!(ip_hdr(skb)->frag_off & htons(IP_DF)) ||
>>>> + (skb->encapsulation &&
>>>> + !(inner_ip_hdr(skb)->frag_off & htons(IP_DF))))
>>>> features &= ~NETIF_F_TSO_MANGLEID;
>>>
>>> FWIW, I think the above is the problematic part causing GSO PARTIAL issues.
>>>
>>> By default UDP tunnels do not set the DF bit, and most/all devices
>>> implementing GSO_PARTIAL clear TSO for encapsulated packet when MANGLEID
>>> is not available.
>>>
>>> I think the following should workaround the problem (assuming my email
>>> client did not corrupt the diff), but I also fear this change will cause
>>> very visible regressions in existing setups.
>>>
>>
>> Thanks for the thorough review!
>>
>> To solve this issue, we can decide that MANGLEID cannot cause
>> incrementing IDs to become fixed for outer headers of encapsulated
>> packets (which is the current behavior), then just revert this diff.
>> I'll update the documentation in segmentation-offloads.rst to reflect this.
>> Do you think that would be a good solution?
>
> I'm not sure I read correctly the above, let me rephrase. You are
> suggesting that devices can set MANGLEID but they must ensure, for
> encapsulated packets, to keep incrementing IDs for outer IP headers even
> when MANGLEID is set. It that what you mean?
>
> Note that most device exposing GSO_PARTIAL can't perform the above.
I think there are two misunderstandings:
1. When I'm talking about mangled ids, I'm mostly talking about forwarded
packets whose IDs are mangled after being forwarded. You also consider
locally generated packets to have mangled IDs if the IDs are modified
during segmentation, which is fair.
2. You say that GSO partial keeps the outer IDs fixed, but AFAICT this isn't
the case. (See below)
I think you understood me correctly, but I'll explain in more detail.
My understanding is that TSO generates packets with incrementing IDs. Since
TSO can't promise to keep fixed IDs, if a packet has SKB_GSO_TCP_FIXEDID, TSO
cannot be used and software GSO must be used instead (this is checked by net_gso_ok).
If you don't care about mangled IDs, MANGLEID can be set on the device so that
TSO can still be used.
If MANGLEID is set on the device, then TSO is allowed to generate either incrementing
or fixed IDs, depending on the device's preference. However, mangling incrementing IDs
into fixed IDs is a problem when the DF-bit is not set, as the packets might then
be fragmented and fragments of different packets will share the same ID. To prevent
this, the check discussed above removes MANGLEID if the DF-bit is not set.
Currently, MANGLEID is only relevant for the inner-most header. With my patch,
MANGLEID explicitly allows the mangling of outer IDs as well, so the check must be
modified to check both the inner and the outer headers.
I suggest that we revert the diff so that only the inner-most header is checked,
allowing TSO even when the DF bit is not set on the outer header. For this to be
correct, we must explicitly define MANGLEID on the outer header to mean that TSO
is not allowed to turn incrementing IDs into fixed IDs when the DF bit is not set.
No code change is required since devices already generate incrementing IDs for the
outer headers.
>
>>> Note that the current status is incorrect - GSO partial devices are
>>> mangling the outer IP ID for encapsulated packets even when the outer
>>> header IP DF is not set.
>>>
>>> /P
>>
>> WDYM?
>
> In the following setup:
>
> TCP socket -> UDP encap device (without 'df set') -> H/W NIC exposing
> GSO partial for encap traffic
>
> with the current kernel, if the TCP socket creates a GSO packet with
> size MSS multiple, the wire packets will have the outer IP header with
> the DF bit NOT set and will have ID fixed - for all the wire packets
> corresponding to a given GSO one.
Are you sure? The documentation clearly states that for GSO partial, the device
drivers must increment outer IDs when the DF-bit is not set. For example, AFAICT,
this is what ixgbe and i40e do, but I don't have the hardware to verify.
I would think the behavior in the example you provided is undesirable, since
the packets could potentially be fragmented later.
>
> /P
>
>> Currently, when the DF-bit isn't set, it means that the IDs must
>> be incrementing. Otherwise, the packets wouldn't have been merged by GRO.
>
> Note that GSO packets can be locally generated.
Of course. I was just referring to forwarded packets.
>
>> GSO partial (and also regular GSO/TSO) generate incrementing IDs, so the
>> IDs cannot be mangled.
>
> AFAIK, most device exposing GSO partial don't increment the outer IP ID
> for encap packet (the silicon is not able to do that).
>
>> With my patch, if the IDs were originally fixed,
>> regardless of the DF-bit, TSO/GSO partial will not occur unless MANGLEID
>> is enabled.
>
> I think the above statement is a lit confusing, S/W segmentation can
> happen even if MANGLEID is enabled on the egress device: for FIXEDID
> pkts, with DF bit not set, both the current kernel code and your patch
> will clear it.
Sorry, I phrased that awkwardly.
>
> /P
>
Powered by blists - more mailing lists