Message-ID: <614d456e-13d9-439d-9520-ad22c8be0327@gmail.com>
Date: Mon, 15 Sep 2025 11:02:45 +0200
From: Richard Gobert <richardbgobert@...il.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>, netdev@...r.kernel.org
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, horms@...nel.org, corbet@....net, saeedm@...dia.com,
tariqt@...dia.com, mbloch@...dia.com, leon@...nel.org,
ecree.xilinx@...il.com, dsahern@...nel.org, ncardwell@...gle.com,
kuniyu@...gle.com, shuah@...nel.org, sdf@...ichev.me,
aleksander.lobakin@...el.com, florian.fainelli@...adcom.com,
alexander.duyck@...il.com, linux-kernel@...r.kernel.org,
linux-net-drivers@....com
Subject: Re: [PATCH net-next v4 3/5] net: gso: restore ids of outer ip headers
correctly
Willem de Bruijn wrote:
> Richard Gobert wrote:
>> Currently, NETIF_F_TSO_MANGLEID indicates that the inner-most ID can
>> be mangled. Outer IDs can always be mangled.
>>
>> Make GSO preserve outer IDs by default, with NETIF_F_TSO_MANGLEID allowing
>> both inner and outer IDs to be mangled.
>>
>> This commit also modifies a few drivers that use SKB_GSO_FIXEDID directly.
>>
>> Signed-off-by: Richard Gobert <richardbgobert@...il.com>
>> ---
>> .../networking/segmentation-offloads.rst | 9 ++++-----
>> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 8 ++++++--
>> drivers/net/ethernet/sfc/ef100_tx.c | 17 +++++++++++++----
>> include/linux/netdevice.h | 9 +++++++--
>> include/linux/skbuff.h | 6 +++++-
>> net/core/dev.c | 4 +++-
>> net/ipv4/af_inet.c | 13 ++++++-------
>> net/ipv4/tcp_offload.c | 5 +----
>> 8 files changed, 45 insertions(+), 26 deletions(-)
>>
>> diff --git a/Documentation/networking/segmentation-offloads.rst b/Documentation/networking/segmentation-offloads.rst
>> index 085e8fab03fd..d5dccfc6b82b 100644
>> --- a/Documentation/networking/segmentation-offloads.rst
>> +++ b/Documentation/networking/segmentation-offloads.rst
>> @@ -46,7 +46,9 @@ GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP
>> ID and all segments will use the same IP ID. If a device has
>> NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when performing TSO
>> and we will either increment the IP ID for all frames, or leave it at a
>> -static value based on driver preference.
>> +static value based on driver preference. For outer headers of encapsulated
>> +packets, the device drivers must guarantee that the IPv4 ID field is
>> +incremented in the case that a given header does not have the DF bit set.
>
> Please split this into three paragraphs on FIXEDID, FIXED_INNER and
> MANGLEID.
>
> Specifically the use of FIXEDID to mean unencapsulated or outer should be
> explicitly mentioned (as discussed previously).
>
> Also, I understood that MANGLEID now means that both inner and outer
> IP ID can be mangled. But this comment appears to say otherwise.
> Maybe it helps to be more explicit also about behavior without DF.
>
Sure, I'll elaborate more on these features.

>> Partial Generic Segmentation Offload
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
>> index b8c609d91d11..505c4ce7cef8 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
>
>> diff --git a/drivers/net/ethernet/sfc/ef100_tx.c b/drivers/net/ethernet/sfc/ef100_tx.c
>> index e6b6be549581..24971346df00 100644
>> --- a/drivers/net/ethernet/sfc/ef100_tx.c
>> +++ b/drivers/net/ethernet/sfc/ef100_tx.c
>
> Not sure whether these driver changes need to be separate patches.
>
I updated the drivers in the same patch so that the kernel stays in a
working state once this patch is applied.

>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index f3a3b761abfb..3d19c888b839 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -5290,13 +5290,18 @@ void skb_warn_bad_offload(const struct sk_buff *skb);
>>
>> static inline bool net_gso_ok(netdev_features_t features, int gso_type)
>> {
>> - netdev_features_t feature = (netdev_features_t)gso_type << NETIF_F_GSO_SHIFT;
>> + netdev_features_t feature;
>> +
>> + if (gso_type & (SKB_GSO_TCP_FIXEDID | SKB_GSO_TCP_FIXEDID_INNER))
>> + gso_type |= __SKB_GSO_TCP_FIXEDID;
>> +
>> + feature = ((netdev_features_t)gso_type << NETIF_F_GSO_SHIFT) & NETIF_F_GSO_MASK;
>>
>> /* check flags correspondence */
>> BUILD_BUG_ON(SKB_GSO_TCPV4 != (NETIF_F_TSO >> NETIF_F_GSO_SHIFT));
>> BUILD_BUG_ON(SKB_GSO_DODGY != (NETIF_F_GSO_ROBUST >> NETIF_F_GSO_SHIFT));
>> BUILD_BUG_ON(SKB_GSO_TCP_ECN != (NETIF_F_TSO_ECN >> NETIF_F_GSO_SHIFT));
>> - BUILD_BUG_ON(SKB_GSO_TCP_FIXEDID != (NETIF_F_TSO_MANGLEID >> NETIF_F_GSO_SHIFT));
>> + BUILD_BUG_ON(__SKB_GSO_TCP_FIXEDID != (NETIF_F_TSO_MANGLEID >> NETIF_F_GSO_SHIFT));
>> BUILD_BUG_ON(SKB_GSO_TCPV6 != (NETIF_F_TSO6 >> NETIF_F_GSO_SHIFT));
>> BUILD_BUG_ON(SKB_GSO_FCOE != (NETIF_F_FSO >> NETIF_F_GSO_SHIFT));
>> BUILD_BUG_ON(SKB_GSO_GRE != (NETIF_F_GSO_GRE >> NETIF_F_GSO_SHIFT));
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index ca8be45dd8be..cf95b325f9b4 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -674,7 +674,7 @@ enum {
>> /* This indicates the tcp segment has CWR set. */
>> SKB_GSO_TCP_ECN = 1 << 2,
>>
>> - SKB_GSO_TCP_FIXEDID = 1 << 3,
>> + __SKB_GSO_TCP_FIXEDID = 1 << 3,
>>
>> SKB_GSO_TCPV6 = 1 << 4,
>>
>> @@ -707,6 +707,10 @@ enum {
>> SKB_GSO_FRAGLIST = 1 << 18,
>>
>> SKB_GSO_TCP_ACCECN = 1 << 19,
>> +
>> + /* These don't correspond with netdev features. */
>
> Can use clarification. Something like
>
> /* These indirectly together map onto the same netdev feature:
> * If NETIF_F_TSO_MANGLEID is set it may mangle both inner and outer.
> */
NP. I'll use something like the comment you suggested.
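
For illustration only, here is a minimal standalone sketch (not kernel code)
of how the two FIXEDID gso_type bits can fold onto the single MANGLEID
feature bit, following the net_gso_ok() hunk quoted above. The names
features_t, FEAT_SHIFT, FEAT_GSO_MASK and gso_ok() are hypothetical
stand-ins, and the shift and mask widths are assumed values chosen for the
example rather than the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t features_t;            /* stand-in for netdev_features_t */

/* gso_type bits, mirroring the layout in the quoted skbuff.h hunk */
#define GSO_TCPV4          (UINT32_C(1) << 0)
#define GSO_FIXEDID_FEAT   (UINT32_C(1) << 3)   /* models __SKB_GSO_TCP_FIXEDID */
#define GSO_FIXEDID        (UINT32_C(1) << 30)  /* models SKB_GSO_TCP_FIXEDID */
#define GSO_FIXEDID_INNER  (UINT32_C(1) << 31)  /* models SKB_GSO_TCP_FIXEDID_INNER */

/* Assumed values for this sketch only */
#define FEAT_SHIFT     16
#define FEAT_GSO_MASK  ((((features_t)1 << 20) - 1) << FEAT_SHIFT)

#define FEAT_TSO       ((features_t)GSO_TCPV4 << FEAT_SHIFT)
#define FEAT_MANGLEID  ((features_t)GSO_FIXEDID_FEAT << FEAT_SHIFT)

/* Returns true if the device features cover every offload the packet needs. */
static bool gso_ok(features_t features, uint32_t gso_type)
{
	features_t feature;

	/* Either fixed-ID flag requires the one MANGLEID feature bit. */
	if (gso_type & (GSO_FIXEDID | GSO_FIXEDID_INNER))
		gso_type |= GSO_FIXEDID_FEAT;

	/* The mask drops bits 30 and 31, which have no feature counterpart. */
	feature = ((features_t)gso_type << FEAT_SHIFT) & FEAT_GSO_MASK;

	return (features & feature) == feature;
}
```

In this model a device advertising only FEAT_TSO rejects a packet carrying
either fixed-ID flag, while one that also advertises FEAT_MANGLEID accepts
both the inner and the outer variant.
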
>> + SKB_GSO_TCP_FIXEDID = 1 << 30,
>> + SKB_GSO_TCP_FIXEDID_INNER = 1 << 31,
>> };
>>
>> #if BITS_PER_LONG > 32
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 93a25d87b86b..f57c8dbf307f 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -3769,7 +3769,9 @@ static netdev_features_t gso_features_check(const struct sk_buff *skb,
>> features &= ~dev->gso_partial_features;
>>
>> /* Make sure to clear the IPv4 ID mangling feature if the
>> - * IPv4 header has the potential to be fragmented.
>> + * IPv4 header has the potential to be fragmented. For
>> + * encapsulated packets, the outer headers are guaranteed to
>> + * have incrementing IDs if DF is not set.
>
> This is saying that if !DF then both inner and outer must be
> incrementing?
>
> Maybe the outer headers are [also] guaranteed to have incrementing IDs.
>
Do you mean the inner headers? What I'm saying is that there is no need to
clear the MANGLEID feature if the outer header doesn't have the DF bit set,
since the driver is guaranteed to generate incrementing IDs for the outer
header in that case. I also stated this in the documentation. See my
previous discussion with Paolo.

I'll change this comment so that it is a bit clearer.

Discussion with Paolo: https://lore.kernel.org/netdev/a88ee88c-707f-4266-b514-d0390166dedb@gmail.com/

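
To illustrate the rule described above, here is a minimal standalone model,
not the kernel's gso_features_check(). It assumes the check consults only
the innermost IPv4 header of an encapsulated packet (the quoted hunk is cut
off before that expression completes). The struct, the function name and the
feature bit value are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define FEAT_TSO_MANGLEID  (UINT64_C(1) << 0)   /* hypothetical bit position */

/* Hypothetical summary of the headers relevant to the check */
struct pkt_model {
	bool encapsulated;  /* packet carries an inner IPv4 header */
	bool outer_df;      /* DF bit of the outer (or only) IPv4 header */
	bool inner_df;      /* DF bit of the inner IPv4 header, if any */
};

/*
 * Clear MANGLEID only when the innermost header may be fragmented.
 * The outer DF bit of an encapsulated packet is deliberately ignored:
 * the driver is expected to increment outer IDs when DF is not set.
 */
static uint64_t mangleid_check(uint64_t features, const struct pkt_model *p)
{
	bool df = p->encapsulated ? p->inner_df : p->outer_df;

	if (!df)
		features &= ~FEAT_TSO_MANGLEID;

	return features;
}
```

Under these assumptions, an encapsulated packet with DF set on the inner
header keeps MANGLEID even when the outer header has DF clear, whereas a
plain packet with DF clear loses it.
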
>> */
>> if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4) {
>> struct iphdr *iph = skb->encapsulation ?
>> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
>> index 76e38092cd8a..fc7a6955fa0a 100644
>> --- a/net/ipv4/af_inet.c
>> +++ b/net/ipv4/af_inet.c
>> @@ -1393,14 +1393,13 @@ struct sk_buff *inet_gso_segment(struct sk_buff *skb,
>>
>> segs = ERR_PTR(-EPROTONOSUPPORT);
>>
>> - if (!skb->encapsulation || encap) {
>> - udpfrag = !!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP);
>> - fixedid = !!(skb_shinfo(skb)->gso_type & SKB_GSO_TCP_FIXEDID);
>> + /* fixed ID is invalid if DF bit is not set */
>> + fixedid = !!(skb_shinfo(skb)->gso_type & (SKB_GSO_TCP_FIXEDID << encap));
>> + if (fixedid && !(ip_hdr(skb)->frag_off & htons(IP_DF)))
>> + goto out;
>>
>> - /* fixed ID is invalid if DF bit is not set */
>> - if (fixedid && !(ip_hdr(skb)->frag_off & htons(IP_DF)))
>> - goto out;
>> - }
>> + if (!skb->encapsulation || encap)
>> + udpfrag = !!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP);
>>
>> ops = rcu_dereference(inet_offloads[proto]);
>> if (likely(ops && ops->callbacks.gso_segment)) {
>> diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
>> index 1949eede9ec9..e6612bd84d09 100644
>> --- a/net/ipv4/tcp_offload.c
>> +++ b/net/ipv4/tcp_offload.c
>> @@ -471,7 +471,6 @@ INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int thoff)
>> const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
>> const struct iphdr *iph = (struct iphdr *)(skb->data + offset);
>> struct tcphdr *th = tcp_hdr(skb);
>> - bool is_fixedid;
>>
>> if (unlikely(NAPI_GRO_CB(skb)->is_flist)) {
>> skb_shinfo(skb)->gso_type |= SKB_GSO_FRAGLIST | SKB_GSO_TCPV4;
>> @@ -485,10 +484,8 @@ INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int thoff)
>> th->check = ~tcp_v4_check(skb->len - thoff, iph->saddr,
>> iph->daddr, 0);
>>
>> - is_fixedid = (NAPI_GRO_CB(skb)->ip_fixedid >> skb->encapsulation) & 1;
>> -
>> skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV4 |
>> - (is_fixedid * SKB_GSO_TCP_FIXEDID);
>> + (NAPI_GRO_CB(skb)->ip_fixedid * SKB_GSO_TCP_FIXEDID);
>
> This was only just introduced. And is still needed?
>
It was only introduced so that the previous patch doesn't affect GSO, to
keep each patch more independent. Now that GSO is fixed by this patch, it is
no longer needed.
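
As a sketch of why the intermediate is_fixedid variable can go away: with
the two flags on adjacent bits (1 << 30 and 1 << 31, as in the quoted
skbuff.h hunk), a two-bit ip_fixedid value maps onto them with a single
multiplication, and the flag for a given header can be selected with a
shift. This is a standalone model, not kernel code. The helper names are
hypothetical, and it assumes bit 0 of ip_fixedid describes the outer (or
only) header and bit 1 the inner one.

```c
#include <stdbool.h>
#include <stdint.h>

#define GSO_TCP_FIXEDID        (UINT32_C(1) << 30)  /* outer or unencapsulated */
#define GSO_TCP_FIXEDID_INNER  (UINT32_C(1) << 31)  /* inner header */

/*
 * Map the two-bit GRO state onto the two gso_type flags.
 * The "& 3u" guard is an addition for this sketch only.
 */
static uint32_t fixedid_to_gso_type(unsigned int ip_fixedid)
{
	return (uint32_t)(ip_fixedid & 3u) * GSO_TCP_FIXEDID;
}

/* Select the flag for the header being segmented: encap picks the inner one. */
static bool header_is_fixedid(uint32_t gso_type, bool encap)
{
	return (gso_type & (GSO_TCP_FIXEDID << encap)) != 0;
}
```

For example, an ip_fixedid value of 2 in this model sets only the inner
flag, so header_is_fixedid() is true with encap set and false without it.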