netdev - Re: [PATCH ipsec v3 2/2] xfrm: Determine inner GSO type from packet inner protocol

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aQDbhJuZqFokEO31@krikkit>
Date: Tue, 28 Oct 2025 16:04:36 +0100
From: Sabrina Dubroca <sd@...asysnail.net>
To: Jianbo Liu <jianbol@...dia.com>, steffen.klassert@...unet.com
Cc: netdev@...r.kernel.org, davem@...emloft.net, kuba@...nel.org,
	Cosmin Ratiu <cratiu@...dia.com>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
	Simon Horman <horms@...nel.org>, David Ahern <dsahern@...nel.org>
Subject: Re: [PATCH ipsec v3 2/2] xfrm: Determine inner GSO type from packet
 inner protocol

2025-10-28, 21:36:17 +0800, Jianbo Liu wrote:
> 
> 
> On 10/28/2025 7:03 PM, Sabrina Dubroca wrote:
> > 2025-10-28, 04:22:48 +0200, Jianbo Liu wrote:
> > > The GSO segmentation functions for ESP tunnel mode
> > > (xfrm4_tunnel_gso_segment and xfrm6_tunnel_gso_segment) were
> > > determining the inner packet's L2 protocol type by checking the static
> > > x->inner_mode.family field from the xfrm state.
> > > 
> > > This is unreliable. In tunnel mode, the state's actual inner family
> > > could be defined by x->inner_mode.family or by
> > > x->inner_mode_iaf.family. Checking only the former can lead to a
> > > mismatch with the actual packet being processed, causing GSO to create
> > > segments with the wrong L2 header type.
> > > 
> > > This patch fixes the bug by deriving the inner mode directly from the
> > > packet's inner protocol stored in XFRM_MODE_SKB_CB(skb)->protocol.
> > > 
> > > Instead of replicating the code, this patch modifies the
> > > xfrm_ip2inner_mode helper function. It now correctly returns
> > > &x->inner_mode if the selector family (x->sel.family) is already
> > > specified, thereby handling both specific and AF_UNSPEC cases
> > > appropriately.
> > 
> > (nit: I think this paragraph goes a bit too much into describing the
> > changes between versions)
> > 
> > > With this change, ESP GSO can use xfrm_ip2inner_mode to get the
> > > correct inner mode. It doesn't affect existing callers, as the updated
> > > logic now mirrors the checks they were already performing externally.
> > 
> > Sorry, maybe I wasn't clear, but I meant that the callers should also
> > be updated to not do the AF_UNSPEC check anymore (note: this will
> > cause merge conflicts with your "NULL inner_mode" cleanup patch [1]).
> > 
> > And I think it would be nicer to split the refactoring into a separate
> > patch. So this series would be:
> > 
> > patch 1: fix xfrm_dev_offload_ok and xfrm_get_inner_ipproto (same as now)
> > patch 2: modify xfrm_ip2inner_mode and remove the AF_UNSPEC check and
> >           setting inner_mode = &x->inner_mode from all callers
> >           [no behavior change, just a refactoring to prepare for patch 3]
> > patch 3: use xfrm_ip2inner_mode for GSO (same as your v2 patch 2/2)
> > 
> > Does that seem ok to you?
> > 
> > 
> > And to avoid the merge conflict with [1], maybe it also makes more
> > sense to integrate that clean up in patch 2 from the list above, so
> > for ip_vti we'd have:
> > 
> > diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
> > index 95b6bb78fcd2..89784976c65e 100644
> > --- a/net/ipv4/ip_vti.c
> > +++ b/net/ipv4/ip_vti.c
> > @@ -118,16 +118,7 @@ static int vti_rcv_cb(struct sk_buff *skb, int err)
> >   	x = xfrm_input_state(skb);
> > -	inner_mode = &x->inner_mode;
> > -
> > -	if (x->sel.family == AF_UNSPEC) {
> > -		inner_mode = xfrm_ip2inner_mode(x, XFRM_MODE_SKB_CB(skb)->protocol);
> > -		if (inner_mode == NULL) {
> > -			XFRM_INC_STATS(dev_net(skb->dev),
> > -				       LINUX_MIB_XFRMINSTATEMODEERROR);
> > -			return -EINVAL;
> > -		}
> > -	}
> > +	inner_mode = xfrm_ip2inner_mode(x, XFRM_MODE_SKB_CB(skb)->protocol);
> >   	family = inner_mode->family;
> > 
> > 
> > Does that sound reasonable?
> 
> I have a concern regarding backporting.
> 
> Patches 1 and 3 in your proposed structure are bug fixes that should ideally
> go into the ipsec tree and be suitable for stable backports.
> Patch 2 should be targeted to ipsec-next as refactoring often does.

If it's part of a bugfix series, I think it's ok to do a small refactoring.

> If so, patch 3 becomes dependent on a change that won't exist in older
> kernels, making it difficult to backport cleanly.

It shouldn't be a problem to backport the refactoring, as this is code
that doesn't change much (the code around calls of xfrm_ip2inner_mode
hasn't been modified since 2019).

> To maintain backportability for the GSO fix, I'd prefer to keep the
> modification to xfrm_ip2inner_mode within the same patch that fixes the GSO
> code (which is currently my v3 patch 2/2).
> 
> My proposed plan is:
> 
> Send the patch 1 and patch 3 (including the xfrm_ip2inner_mode change)
> together to the ipsec tree. They are self-contained fixes.

So, keep v3 of this series unchanged.

> Separately, after those are accepted, I can modify and re-submit that patch
> [1] to ipsec-next that removes the now-redundant checks from the other
> callers (VTI, etc.), leveraging the updated helper function.
> 
> This way, the critical fixes are self-contained and backportable, while the
> cleanup of other callers happens later in the development cycle.

The only (small) drawback is leaving the duplicate code checking
AF_UNSPEC in the existing callers of xfrm_ip2inner_mode, but I guess
that's ok.


Steffen, is it ok for you to

 - have a duplicate AF_UNSPEC check in callers of xfrm_ip2inner_mode
   (the existing "default to x->inner_mode, call xfrm_ip2inner_mode if
   AF_UNSPEC", and the new one added to xfrm_ip2inner_mode by this
   patch) in the ipsec tree and then in stable?

 - do the clean up (like the diff I pasted in my previous email, or
   something smaller if [1] is applied separately) in ipsec-next after
   ipsec is merged into it?


[1] https://lore.kernel.org/netdev/20251027023818.46446-1-jianbol@nvidia.com/


-- 
Sabrina