[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <6877bed46f3a4_6ce6d29469@willemb.c.googlers.com.notmuch>
Date: Wed, 16 Jul 2025 11:01:40 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Felix Fietkau <nbd@....name>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
netdev@...r.kernel.org,
Eric Dumazet <edumazet@...gle.com>,
Neal Cardwell <ncardwell@...gle.com>,
Kuniyuki Iwashima <kuniyu@...gle.com>,
"David S. Miller" <davem@...emloft.net>,
David Ahern <dsahern@...nel.org>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>,
Willem de Bruijn <willemb@...gle.com>,
Richard Gobert <richardbgobert@...il.com>
Cc: linux-kernel@...r.kernel.org
Subject: Re: [PATCH net] net: fix segmentation after TCP/UDP fraglist GRO
Felix Fietkau wrote:
> On 06.07.25 15:45, Willem de Bruijn wrote:
> > Felix Fietkau wrote:
> >> Since "net: gro: use cb instead of skb->network_header", the skb network
> >> header is no longer set in the GRO path.
> >> This breaks fraglist segmentation, which relies on ip_hdr()/tcp_hdr()
> >
> > Only ip_hdr is in scope.
> >
> > Reviewing TCP and UDP GSO, tcp_hdr/transport header is used also
> > outside segment list. Non segment list GSO also uses ip_hdr in case
> > pseudo checksum needs to be set.
> >
> > The GSO code is called with skb->data at the relevant header, so L4
> > helpers are not strictly needed. The main issue is that data will be
> > at the L4 header, and some GSO code also needs to see the IP header
> > (e.g., for aforementioned pseudo checksum calculation).
>
> I just spent some time reviewing the code in order to understand how to
> fix it properly, and I still don't fully understand what you wrote
> above, especially the part related to "Non segment list GSO".
>
> The issue that I'm trying to fix is that the skb network header is wrong
> for all skbs stored in the frag_list of the first skb.
> The main skb is fine, since the offsets are handled by the network
> stack. For all the extra fragment skbs, the offsets are unset because we
> bypassed the part of the stack that sets them.
>
> Since non-segment-list GSO skbs don't carry extra frag_list skbs, I
> don't see how they can share the same issue.
Reviewed-by: Willem de Bruijn <willemb@...gle.com>
Good point.
This patch's approach of setting the field in the gro_receive call of
the inner L4 header (and not the UDP GRO code for the outer encap UDP
header) is definitely preferable over having to iterate over the list
of frag_list skbs again later.
The only issue is whether the setting is correct and safe in case of
tunnels.
On rereading, I think it is. Hence the Reviewed-by tag.
That case is complex enough that ideally we would have a testcase to
cover it. To be clear, I definitely don't mean to ask you to write
that. Just a note to self for the future and/or for anyone whose
workload actually depends on tunneled packets with fraglist GSO (which
is not me, actually).
The main fix in Richard's series was to have the GRO complete code no
longer depend on skb_network_header. Whether or not skb_network_header
is also still set is immaterial to the fix. We just assumed that
nothing besides that GRO complete code even referenced it.
Setting skb_network_offset unconditionally would not be correct for
the head_skb in the presence of tunneling. Because GSO handling of
tunneling requires this to be correctly set in the inner fields:
In __skb_udp_tunnel_segment:
/* setup inner skb. */
skb->encapsulation = 0;
SKB_GSO_CB(skb)->encap_level = 0;
__skb_pull(skb, tnl_hlen);
skb_reset_mac_header(skb);
skb_set_network_header(skb, skb_inner_network_offset(skb));
skb_set_transport_header(skb, skb_inner_transport_offset(skb));
After this tunnel GSO handler, the rest of the GSO stack no longer
sees the tunnel header and treats the packet as unencapsulated.
The fraglist code is only relevant for the innermost L4 transport
header. Since nothing touches the frag_list skb headers between GRO
receive, and L4 GSO fraglist segment handlers __tcp?_gso_segment_list
and __udpv?_gso_segment_list_csum, it is fine to have network offset
set, without having to handle encapsulation explicitly.
> If I misread what you wrote, please point me at the relevant code
> context that I'm missing.
>
> Thanks,
>
> - Felix
Powered by blists - more mailing lists