lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CADvbK_dQ6kUamSW8zxKgydYpZxesHPcqKo+2eV7DZiGU4Vazng@mail.gmail.com>
Date: Thu, 20 Mar 2025 14:37:01 -0400
From: Xin Long <lucien.xin@...il.com>
To: Cosmin Ratiu <cratiu@...dia.com>
Cc: Mark Bloch <mbloch@...dia.com>, "chopps@...n.net" <chopps@...n.net>, 
	"davem@...emloft.net" <davem@...emloft.net>, 
	"herbert@...dor.apana.org.au" <herbert@...dor.apana.org.au>, Dragos Tatulea <dtatulea@...dia.com>, 
	"steffen.klassert@...unet.com" <steffen.klassert@...unet.com>, 
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>, "pabeni@...hat.com" <pabeni@...hat.com>, 
	"horms@...nel.org" <horms@...nel.org>, "edumazet@...gle.com" <edumazet@...gle.com>, 
	"kuba@...nel.org" <kuba@...nel.org>, Yael Chemla <ychemla@...dia.com>, 
	"wangfe@...gle.com" <wangfe@...gle.com>
Subject: Re: [PATCH net] xfrm: Force software GSO only in tunnel mode

On Thu, Mar 20, 2025 at 11:24 AM Cosmin Ratiu <cratiu@...dia.com> wrote:
>
> On Mon, 2025-03-17 at 15:30 -0400, Xin Long wrote:
> > For UDP tunnels, there are two types:
> >
> > - ENCAP_TYPE_ETHER encaps an ether packet (e.g., VXLAN, Geneve).
> > - ENCAP_TYPE_IPPROTO encaps an ipproto packet (e.g., SCTP over UDP).
> >
> > When performing GSO via skb_udp_tunnel_segment():
> >
> > - ENCAP_TYPE_ETHER relies on inner_network_header to locate the
> >   network header.
> > - ENCAP_TYPE_IPPROTO relies on inner_transport_header to locate
> >   the transport header.
> >
> > However, both IPsec transport and tunnel modes modify
> > inner_transport_header. This patch raises a concern that GSO may
> > not work correctly for ENCAP_TYPE_IPPROTO UDP tunnels over IPsec
> > in transport mode.
>
> skb_udp_tunnel_segment -> __skb_udp_tunnel_segment does:
>
> tnl_hlen = skb_inner_mac_header(skb) - skb_transport_header(skb);
> __skb_pull(skb, tnl_hlen);
> skb_reset_mac_header(skb);
> skb_set_network_header(skb, skb_inner_network_offset(skb));
> skb_set_transport_header(skb, skb_inner_transport_offset(skb));
>
> As a concrete example, in case of TCP over Geneve over IPsec in
> transport mode, this is the sequence of header manipulations done:
> geneve_build_skb: mh 190 nh 204 th 224 imh 190 inh 204 ith 224 data 182
> iptunnel_xmit: mh 190 nh 154 th 174 imh 190 inh 204 ith 224 data 154
> xfrm4_transport_output: mh 147 nh 138 th 158 imh 190 inh 204 ith 174
> data 174
> skb_mac_gso_segment: mh 124 nh 138 th 158 imh 190 inh 204 ith 174 data
> 124
> inet_gso_segment: mh 124 nh 138 th 158 imh 190 inh 204 ith 174 data 138
> esp4_gso_segment: mh 124 nh 138 th 158 imh 190 inh 204 ith 174 data 158
> __skb_udp_tunnel_segment: mh 124 nh 138 th 174 imh 190 inh 204 ith 174
> data 174
> __skb_udp_tunnel_segment: tnl_hlen 16
> __skb_udp_tunnel_segment: inner skb mh 190 nh 204 th 174 imh 190 inh
> 204 ith 174 data 190
> skb_mac_gso_segment: mh 190 nh 204 th 174 imh 190 inh 204 ith 174 data
> 190
> inet_gso_segment: mh 190 nh 204 th 174 imh 190 inh 204 ith 174 data 204
> tcp_gso_segment: mh 190 nh 204 th 224 imh 190 inh 204 ith 174 data 224
>
> All numbers are offsets from skb->head printed at function start
> (except for '__skb_udp_tunnel_segment: inner', printed after the code
> block mentioned above).
> I see that xfrm4_transport_output moves the inner transport header
> forward (to 174) and that __skb_udp_tunnel_segment incorrectly sets
> transport header to it, but fortunately inet_gso_segment resets it to
> the correct value after pulling the ip header.
>

I agree that it works for ENCAP_TYPE_ETHER tunnels in transport mode
since they do not rely on the incorrect transport header set in
__skb_udp_tunnel_segment().

> In case of ENCAP_TYPE_IPPROTO, inet_gso_segment/ipv6_gso_segment would
> be invoked directly by __skb_udp_tunnel_segment and it would see the
> network header set correctly. But both compute nhoff like this:
> nhoff = skb_network_header(skb) - skb_mac_header(skb);
> Which would be 0 given mac_header is set to the same offset as the ip
> header.
> But that only makes the pskb_may_pull check & the skb_pull not do
> anything. The functions then proceed to set up the transport header
> correctly
> I think the code might still work but I haven't verified with a
> ENCAP_TYPE_IPPROTO protocol.

For ENCAP_TYPE_IPPROTO, I believe it does not invoke inet_gso_segment()
or ipv6_gso_segment(). Instead, it directly calls the transport layer's
.gso_segment function, such as sctp_gso_segment() for SCTP over a UDP
tunnel. This behavior can be seen in skb_udp_tunnel_segment():

  case ENCAP_TYPE_IPPROTO:
      offloads = is_ipv6 ? inet6_offloads : inet_offloads;
      ops = rcu_dereference(offloads[skb->inner_ipproto]);

skb->inner_ipproto == IPPROTO_SCTP and
inet(6)_offloads[IPPROTO_SCTP] == sctp_gso_segment

sctp_gso_segment() assumes the transport header is correctly set and
retrieves the SCTP header using skb_transport_header(skb). However,
the transport header is incorrectly set earlier in the code:

  skb_set_network_header(skb, skb_inner_network_offset(skb));
  skb_set_transport_header(skb, skb_inner_transport_offset(skb));  <---
  ...
  segs = gso_inner_segment(skb, features);

With an incorrect sctphdr, this could lead to unexpected behavior or
even a crash.

I haven't had time to verify it either, but there are perhaps other
ENCAP_TYPE_IPPROTO tunnels easier to set up for testing.

>
> In general, while staring at this code, I got the impression that these
> functions are brittle, relying on assumptions made in completely
> different areas that might easily be broken given a different
> combination of protocols.
>
> I think that the code block in __skb_udp_tunnel_segment could be made
> less brittle if it stops relying on the saved inner headers (which
> might not be set to the actual inner protocols about to be handled),
> and instead parse the mac or network headers and set the next headers
> accordingly. This might even allow back HW GSO for XFRM in tunnel mode
> with crypto offload, like before your original 2020 patch. WDYT?
>

The inner_network_header and inner_transport_header were introduced
to support hardware-offloaded encapsulation. skb_udp_tunnel_segment()
may be just simply mimicking the hardware behavior and wasn’t
designed to handle nested encapsulation.

>From what I can see:

For ENCAP_TYPE_ETHER, these headers help avoid extra CPU work by
skipping link-layer parsing to locate the network header, making
the link layer transparent to tunnel segmentation.

For ENCAP_TYPE_IPPROTO, without inner_transport_header, identifying
the transport header solely by parsing the inner packet may not be
possible. For example, distinguishing between these two UDP tunnels
in tunnel segmentation:

  FOU: ETH | IP | UDP | IPPROTO_XXX
  GUE: ETH | IP | UDP | GUE HDR | IPPROTO_XXX

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ