[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aWTn12LAh-KnSoeM@secunet.com>
Date: Mon, 12 Jan 2026 13:23:51 +0100
From: Steffen Klassert <steffen.klassert@...unet.com>
To: Sabrina Dubroca <sd@...asysnail.net>
CC: <netdev@...r.kernel.org>, <devel@...ux-ipsec.org>
Subject: Re: [PATCH RFC ipsec-next] esp: Consolidate esp4 and esp6.
On Mon, Oct 27, 2025 at 05:52:46PM +0100, Sabrina Dubroca wrote:
> 2025-10-22, 08:03:07 +0200, Steffen Klassert wrote:
> > This patch merges common code of esp4.c and esp6.c into
> > xfrm_esp.c. This almost halves the size of the ESP
> > implementation for the prize of three indirect calls
> > on UDP/TCP encapsulation.
>
> If that turns out to be a problem, maybe we can figure out a way to
> use the INDIRECT_CALL* helpers here as well (but currently they don't
> work for modules, I played with that a while back and could look into
> it again).
I have no indication that it could be a problem so far,
but would be nice to have the INDIRECT_CALL* helpers
for modules anyway!
> > -int esp_input_done2(struct sk_buff *skb, int err)
> > -{
> [...]
> > + if (iph->saddr != x->props.saddr.a4 ||
> > + source != encap->encap_sport) {
> > + xfrm_address_t ipaddr;
> > +
> > + ipaddr.a4 = iph->saddr;
> > + km_new_mapping(x, &ipaddr, source);
> > +
> > + /* XXX: perhaps add an extra
> > + * policy check here, to see
> > + * if we should allow or
> > + * reject a packet from a
> > + * different source
> > + * address/port.
> > */
>
> Maybe we can get rid of those "XXX" comments? Unless you think the
> suggestion still makes sense. But the comments (here and in
> esp6_input_done2) have been here a long time and it doesn't seem to
> bother users.
Good point, I'll remove them in the next version.
>
> [...]
> > static int esp_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack)
>
> Looking at esp_init_state and esp6_init_state, they also have a lot in
> common (pretty much everything except some x->props.header_len
> adjustments) that could be extracted into the new file.
Yes, true. This patch is just a start, there is much more possible.
> [...]
> > +static int esp6_input_encap(struct sk_buff *skb, struct xfrm_state *x)
> > {
> > + const struct ipv6hdr *ip6h = ipv6_hdr(skb);
> > + int offset = skb_network_offset(skb) + sizeof(*ip6h);
> > + int hdr_len = skb_network_header_len(skb);
>
> As Simon mentioned, it's set/incremented but not used. The current
> code in esp6_input_done2 would then use that value in
> skb_set_transport_header, but now the common esp_input_done2 calls
> ->input_encap and doesn't get that increased value back. I think we'll
> need to have esp_input_done2 pass &hdr_len when it calls the
> ->input_encap to get the correct value and take into account ipv6
> exthdrs.
Either this, or ->input_encap returns the offset of the ipv6
exthdrs and then add this value to hdr_len in esp_input_done2().
> [...]
> > +static void esp_output_done(void *data, int err)
> > +{
> > + struct sk_buff *skb = data;
> > + struct xfrm_offload *xo = xfrm_offload(skb);
> > + void *tmp;
> > + struct xfrm_state *x;
> > +
> > + if (xo && (xo->flags & XFRM_DEV_RESUME)) {
> > + struct sec_path *sp = skb_sec_path(skb);
> > +
> > + x = sp->xvec[sp->len - 1];
> > + } else {
> > + x = skb_dst(skb)->xfrm;
> > + }
> > +
> > + tmp = ESP_SKB_CB(skb)->tmp;
> > + esp_ssg_unref(x, tmp, skb);
> > + kfree(tmp);
> > +
> > + x->type->output_encap_csum(skb);
>
> Since the ipv4 variant doesn't do anything, maybe
>
> if (x->type->output_encap_csum)
> x->type->output_encap_csum(skb);
>
> and don't set it in esp_type?
>
> OTOH it's nice to have a consistent "all CBs are always defined" and
> just call them unconditionally.
Let's keeep the CBs always defined.
> > +static struct ip_esp_hdr *esp_output_udp_encap(struct sk_buff *skb,
> > + int encap_type,
> > + struct esp_info *esp,
> > + __be16 sport,
> > + __be16 dport)
> > +{
> > + struct udphdr *uh;
> > + unsigned int len;
> > + struct xfrm_offload *xo = xfrm_offload(skb);
> > +
> > + len = skb->len + esp->tailen - skb_transport_offset(skb);
> > + if (len + sizeof(struct iphdr) > IP_MAX_MTU)
>
> This is now used by both IPv4 and IPv6, and esp6_output_udp_encap
> didn't have any adjustment in this test.
I wonder whether this check makes sense here at all. We add ESP
headers and trailers, and just in case of UDP/TCP encapsulation,
we check the size of the resulting packet. If we need such a check,
it should be on a more generic place.
Another thing I noticed when looking into this. The code
above is from ipv4 udpencap. All other encap functions do:
len = skb->len + esp->tailen - skb_transport_offset(skb);
if (len > IP_MAX_MTU)
return ERR_PTR(-EMSGSIZE);
I think this is 'kind of' correct for ipv6, but not for ipv4 tcpencap:
#define IP_MAX_MTU 0xFFFFU
#define IP6_MAX_MTU (0xFFFF + sizeof(struct ipv6hdr))
> > + return ERR_PTR(-EMSGSIZE);
> > +
> > + uh = (struct udphdr *)esp->esph;
> > + uh->source = sport;
> > + uh->dest = dport;
> > + uh->len = htons(len);
> > + uh->check = 0;
> > +
> > + /* For IPv4 ESP with UDP encapsulation, if xo is not null, the skb is in the crypto offload
> > + * data path, which means that esp_output_udp_encap is called outside of the XFRM stack.
> > + * In this case, the mac header doesn't point to the IPv4 protocol field, so don't set it.
> > + */
> > + if (!xo || encap_type != UDP_ENCAP_ESPINUDP)
> > + *skb_mac_header(skb) = IPPROTO_UDP;
>
> That was absent in esp6_output_udp_encap, commit 447bc4b1906f ("xfrm:
> Support crypto offload for outbound IPv4 UDP-encapsulated ESP packet")
> only took care of IPv4. I'm not sure adding this to the IPv6 code
> without adapting the rest of 447bc4b1906f in the esp6_offload code is
> correct.
Hm, that would mean ipv6 udpencap offload can be configured,
but does not work in the current codebase. Maybe this needs
a separate fix.
> > +
> > + return (struct ip_esp_hdr *)(uh + 1);
> > +}
>
> [...]
> > + int esp_output(struct xfrm_state *x, struct sk_buff *skb)
>
> nit: ' ' at the start of the line
Fixed.
Thanks!
Powered by blists - more mailing lists