Message-ID: <Yae6lGvTt8sCtLJX@lunn.ch>
Date: Wed, 1 Dec 2021 19:10:28 +0100
From: Andrew Lunn <andrew@...n.ch>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
David Ahern <dsahern@...nel.org>,
James Prestwood <prestwoj@...il.com>,
Justin Iurman <justin.iurman@...ege.be>,
Praveen Chaudhary <praveen5582@...il.com>,
"Jason A . Donenfeld" <Jason@...c4.com>,
Eric Dumazet <edumazet@...gle.com>,
netdev <netdev@...r.kernel.org>
Subject: Re: [patch RFC net-next 2/3] icmp: ICMPV6: Examine invoking packet
for Segment Route Headers.
On Wed, Dec 01, 2021 at 09:33:32AM -0800, Willem de Bruijn wrote:
> > include/linux/ipv6.h | 2 ++
> > net/ipv6/icmp.c | 36 +++++++++++++++++++++++++++++++++++-
> > 2 files changed, 37 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
> > index 20c1f968da7c..d8ab5022d397 100644
> > --- a/include/linux/ipv6.h
> > +++ b/include/linux/ipv6.h
> > @@ -133,6 +133,7 @@ struct inet6_skb_parm {
> > __u16 dsthao;
> > #endif
> > __u16 frag_max_size;
> > + __u16 srhoff;
>
> Out of scope for this patch, but I guess we could use a
>
> BUILD_BUG_ON(sizeof(struct inet6_skb_parm) > sizeof_field(struct sk_buff, cb));
There is something like that already. I triggered a BUILD_BUG_ON
failure when I put the actual IPv6 destination address here, rather
than an offset to it.
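
For readers outside the kernel tree, a minimal user-space sketch of that kind of compile-time size guard follows. It is illustrative only: `demo_skb_parm`, `DEMO_CB_SIZE` and `fits_in_cb()` are hypothetical names, the struct is a cut-down stand-in rather than the real `struct inet6_skb_parm`, and the 48-byte limit is an assumption standing in for the size of the `cb` scratch area.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Assumed size of the per-packet scratch area (stand-in for sk_buff's cb). */
#define DEMO_CB_SIZE 48

/* Hypothetical, cut-down per-packet state. It stores a 2-byte offset to the
 * SRH instead of a 16-byte IPv6 address, which keeps the structure small. */
struct demo_skb_parm {
    int32_t  iif;
    uint16_t flags;
    uint16_t frag_max_size;
    uint16_t srhoff;
};

/* Compile-time guard: the build fails if the structure outgrows the area. */
static_assert(sizeof(struct demo_skb_parm) <= DEMO_CB_SIZE,
              "demo_skb_parm must fit in the scratch area");

/* Run-time form of the same check, handy for experimenting with sizes. */
static int fits_in_cb(size_t size)
{
    return size <= DEMO_CB_SIZE;
}
```

Growing the structure past `DEMO_CB_SIZE` turns the `static_assert` into a compile error, which is the same failure mode described above for the full address.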
> > #define IP6SKB_XFRM_TRANSFORMED 1
> > #define IP6SKB_FORWARDED 2
> > @@ -142,6 +143,7 @@ struct inet6_skb_parm {
> > #define IP6SKB_HOPBYHOP 32
> > #define IP6SKB_L3SLAVE 64
> > #define IP6SKB_JUMBOGRAM 128
> > +#define IP6SKB_SEG6 512
>
> 256?
Doh!
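
As a small illustration of why 256 is the expected value, the sketch below picks the next single-bit flag above the highest bit already in use. The helper `next_free_flag()` is hypothetical and not part of the patch; it assumes a 16-bit flags field like the `__u16` members shown in the hunk.

```c
#include <stdint.h>

/* Returns the lowest power of two strictly above the highest bit set in
 * `used`, or 0 if no such value fits in a 16-bit flags field. */
static uint16_t next_free_flag(uint16_t used)
{
    uint32_t bit = 1;

    while (bit <= used)
        bit <<= 1;

    return bit <= UINT16_MAX ? (uint16_t)bit : 0;
}
```

With the flags visible in the quoted hunk (up to `IP6SKB_JUMBOGRAM`, 128) passed in as `used`, the helper yields 256.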
> > +static void icmpv6_notify_srh(struct sk_buff *skb, struct inet6_skb_parm *opt)
> > +{
> > + struct sk_buff *skb_orig;
> > + struct ipv6_sr_hdr *srh;
> > +
> > + skb_orig = skb_clone(skb, GFP_ATOMIC);
> > + if (!skb_orig)
> > + return;
>
> Is this to be allowed to write to skb->cb? Or because seg6_get_srh
> calls pskb_may_pull to parse the headers?
This is an ICMP error message. So we have an IP packet, skb, which
contains in the message body the IP packet which invoked the error. If
we pass skb to seg6_get_srh() it will look in the received ICMP
packet. But we actually want to find the SRH in the packet which
invoked the error, the one which is in the message body. So the code
makes a clone of the skb, and then updates the pointers so that it
points to the invoking packet within the ICMP packet. Then we can use
seg6_get_srh() on this inner packet, since it just looks like an
ordinary IP packet.
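
The nesting described here can be sketched in user space on a flat byte buffer. This is not the kernel code path: `find_invoking_ipv6()` is a hypothetical helper, and it assumes the outer IPv6 header carries no extension headers, so the invoking packet starts right after the 40-byte IPv6 header and the 8-byte ICMPv6 error header.

```c
#include <stddef.h>
#include <stdint.h>

#define OUTER_IPV6_HDR_LEN 40  /* fixed IPv6 header */
#define ICMPV6_ERR_HDR_LEN  8  /* type, code, checksum, 4-byte field */

/* Returns a pointer to the invoking (inner) IPv6 header inside an ICMPv6
 * error packet, or NULL if the buffer is too short or the inner packet
 * does not look like IPv6. */
static const uint8_t *find_invoking_ipv6(const uint8_t *pkt, size_t len)
{
    size_t inner_off = OUTER_IPV6_HDR_LEN + ICMPV6_ERR_HDR_LEN;

    /* Need the outer headers plus a complete inner IPv6 header. */
    if (pkt == NULL || len < inner_off + OUTER_IPV6_HDR_LEN)
        return NULL;

    /* The top nibble of the first byte is the IP version. */
    if ((pkt[inner_off] >> 4) != 6)
        return NULL;

    return pkt + inner_off;
}
```

Once a parser is pointed at the returned address, the inner packet can be walked like any other IPv6 packet, which is the effect the clone plus pointer update is after.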
> It is unlikely (not impossible) in this path for the packet to be
> shared or cloned. Avoid this operation when it isn't? Most packets
> will not actually have segment routing, so this imposes significant
> cost on the common case (if in the not common ICMP processing path).
>
> nit: I found the name skb_orig confusing, as it is not in the meaning
> of preserve the original skb as at function entry.
skb_invoking? That seems to be the ICMP terminology?
> > + skb_dst_drop(skb_orig);
> > + skb_reset_network_header(skb_orig);
> > +
> > + srh = seg6_get_srh(skb_orig, 0);
> > + if (!srh)
> > + goto out;
> > +
> > + if (srh->type != IPV6_SRCRT_TYPE_4)
> > + goto out;
> > +
> > + opt->flags |= IP6SKB_SEG6;
> > + opt->srhoff = (unsigned char *)srh - skb->data;
>
> Should this offset be against skb->head, in case other data move
> operations could occur?
I copied the idea from get_srh(). It does:

	srh = (struct ipv6_sr_hdr *)(skb->data + srhoff);

So I'm just undoing it.
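
A minimal user-space sketch of that round trip, using plain byte pointers in place of `skb->data`. The helper names `ptr_to_off()` and `off_to_ptr()` are hypothetical. Note that an offset recorded this way is only meaningful relative to the same base pointer, which is the concern raised in the question above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Record where a header sits as a 16-bit offset from the data pointer. */
static uint16_t ptr_to_off(const uint8_t *data, const uint8_t *hdr)
{
    /* The header must lie at or after data and fit in 16 bits. */
    assert(hdr >= data && (size_t)(hdr - data) <= UINT16_MAX);
    return (uint16_t)(hdr - data);
}

/* Undo it: rebuild the header pointer from the same base and the offset. */
static const uint8_t *off_to_ptr(const uint8_t *data, uint16_t off)
{
    return data + off;
}
```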
> Also, what happens if the header was in a frags that was pulled by
> pskb_may_pull in seg6_get_srh.
Yes, I checked that. Because the skb has been cloned, if it needs to
rearrange the packet because it goes over a fragment boundary,
pskb_may_pull() will return false. And then we won't find the
SRH. Nothing bad happens, traceroute is still broken as before. What
is a typical fragment size? We basically need a MAC header, IPv6
header, ICMP header and another IP header. 14 + 40 + 8 + 40. Plus the
SRH. So if 128 byte fragments are being used, then yes, it
could be an issue. But is that realistic? It seems more likely 1K, 2K
or 4K fragments are used?
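
To put numbers on that, here is a small sketch of the arithmetic. It assumes the SRH layout from RFC 8754 (an 8-byte fixed part followed by one 16-byte address per segment) and no optional TLVs; `linear_bytes_needed()` and the `DEMO_*` constants are hypothetical names used only for this example.

```c
#include <stddef.h>

enum {
    DEMO_ETH_LEN     = 14,  /* MAC header */
    DEMO_IPV6_LEN    = 40,  /* fixed IPv6 header (outer and inner) */
    DEMO_ICMPV6_LEN  = 8,   /* ICMPv6 error header */
    DEMO_SRH_FIXED   = 8,   /* SRH fixed part (assumed, per RFC 8754) */
    DEMO_SRH_SEGMENT = 16   /* one IPv6 address per segment */
};

/* Bytes from the start of the frame to the end of the inner SRH,
 * for an SRH carrying nsegs segments and no TLVs. */
static size_t linear_bytes_needed(size_t nsegs)
{
    return DEMO_ETH_LEN + DEMO_IPV6_LEN + DEMO_ICMPV6_LEN + DEMO_IPV6_LEN
         + DEMO_SRH_FIXED + nsegs * DEMO_SRH_SEGMENT;
}
```

Under these assumptions the headers alone come to 102 bytes, a one-segment SRH brings the total to 126, and a two-segment SRH reaches 142, which would cross a 128-byte fragment boundary but sits well inside 1K or larger fragments.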
> If we can expect headers to exist in the linear segment, then perhaps
> the whole code can be simplified and the clone can be avoided.
It will require seg6_get_srh() to be re-written so that you can tell
it to look at a nested IP header. Which actually means ipv6_find_hdr()
needs re-writing. Things like the helper ipv6_hdr(skb) point to the
ICMP packet IP header, not the invoking IP packet header inside the
ICMP packet. I didn't like the idea of such a rewrite.
Andrew