Message-ID: <20210628150436.GA3495@pc-23.home>
Date: Mon, 28 Jun 2021 17:04:36 +0200
From: Guillaume Nault <gnault@...hat.com>
To: David Ahern <dsahern@...il.com>
Cc: David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
David Ahern <dsahern@...nel.org>,
Simon Horman <simon.horman@...ronome.com>,
Martin Varghese <martin.varghese@...ia.com>,
Eli Cohen <elic@...dia.com>, Jiri Benc <jbenc@...hat.com>,
Tom Herbert <tom@...bertland.com>,
Pablo Neira Ayuso <pablo@...filter.org>,
Harald Welte <laforge@...monks.org>,
Andreas Schultz <aschultz@...p.net>,
Jonas Bonn <jonas@...rbonn.se>
Subject: Re: [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices

On Sun, Jun 27, 2021 at 09:56:53AM -0600, David Ahern wrote:
> On 6/26/21 2:53 PM, Guillaume Nault wrote:
> > On Sat, Jun 26, 2021 at 11:50:19AM -0600, David Ahern wrote:
> >> On 6/25/21 7:32 AM, Guillaume Nault wrote:
> >>> Some virtual L3 devices, like vxlan-gpe and gre (in collect_md mode),
> >>> reset the MAC header pointer after they parsed the outer headers. This
> >>> accurately reflects the fact that the decapsulated packet is pure L3
> >>> packet, as that makes the MAC header 0 bytes long (the MAC and network
> >>> header pointers are equal).
> >>>
> >>> However, many L3 devices only adjust the network header after
> >>> decapsulation and leave the MAC header pointer to its original value.
> >>> This can confuse other parts of the networking stack, like TC, which
> >>> then considers the outer headers as one big MAC header.
> >>>
> >>> This patch series makes the following L3 tunnels behave like VXLAN-GPE:
> >>> bareudp, ipip, sit, gre, ip6gre, ip6tnl, gtp.
> >>>
> >>> The case of gre is a bit special. It already resets the MAC header
> >>> pointer in collect_md mode, so only the classical mode needs to be
> >>> adjusted. However, gre also has a special case that expects the MAC
> >>> header pointer to keep pointing to the outer header even after
> >>> decapsulation. Therefore, patch 4 keeps an exception for this case.
> >>>
> >>> Ideally, we'd centralise the call to skb_reset_mac_header() in
> >>> ip_tunnel_rcv(), to avoid manual calls in ipip (patch 2),
> >>> sit (patch 3) and gre (patch 4). That's unfortunately not feasible
> >>> currently, because of the gre special case discussed above that
> >>> precludes us from resetting the MAC header unconditionally.
> >>
> >> What about adding a flag to ip_tunnel indicating if it can be done (or
> >> should not be done since doing it is the most common)?
> >
> > That's feasible. I didn't do it here because I wanted to keep the
> > patch series focused on L3 tunnels. Modifying ip_tunnel_rcv()'s
> > prototype would also require updating erspan_rcv(), which isn't L3
> > (erspan carries Ethernet frames). I was feeling such consolidation
> > would be best done in a follow up patch series.
>
> I was thinking a flag in 'struct ip_tunnel'. It's the private data for
> those netdevices, so a per-instance setting. I haven't walked through
> the details to know if it would work.

I didn't think about that. Good idea, that looks perfectly doable. But
I'd still prefer to centralise the skb_reset_mac_header() call in a
dedicated patch set. If we did it here, we'd have to not reset the MAC
header by default, to guarantee that unrelated tunnels wouldn't be
affected.

However, I think that the default behaviour should be to reset the MAC
header and that only special cases, like the one in ip_gre, should
explicitly turn that off. Therefore, we'd need a follow-up patch
anyway, to change the way this "reset_mac" flag would be set.
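
To make the per-instance flag idea concrete, here is a rough userspace
sketch. It is not kernel code and not part of this series: struct
mock_skb, struct mock_tunnel, the keep_outer_mac field and both helpers
are hypothetical stand-ins for sk_buff, struct ip_tunnel and
ip_tunnel_rcv(). The flag is written as an opt-out so that
zero-initialised private data resets the MAC header by default.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two sk_buff offsets discussed here. */
struct mock_skb {
	size_t mac_header;	/* offset of the MAC header */
	size_t network_header;	/* offset of the network header */
};

/* Hypothetical stand-in for the per-device private data.
 * keep_outer_mac is an assumed name, not an existing field.
 */
struct mock_tunnel {
	bool keep_outer_mac;	/* false (default): reset the MAC header */
};

/* Models the receive path after the outer headers have been parsed.
 * inner_offset is where the decapsulated L3 packet starts.
 */
static void mock_tunnel_rcv(const struct mock_tunnel *t,
			    struct mock_skb *skb, size_t inner_offset)
{
	skb->network_header = inner_offset;

	/* Default case: make the MAC header 0 bytes long, which is the
	 * effect skb_reset_mac_header() has once the data pointer sits
	 * on the inner network header.
	 */
	if (!t->keep_outer_mac)
		skb->mac_header = skb->network_header;
}

/* Length of what the rest of the stack would treat as the MAC header. */
static size_t mock_mac_len(const struct mock_skb *skb)
{
	return skb->network_header - skb->mac_header;
}
```

With this layout, only the special case would set keep_outer_mac, and
every other tunnel would get a zero-length MAC header without touching
its setup code.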

IMHO, the current approach has the advantage of clearly separating the
new feature from the refactoring. But if you feel strongly about using
a flag in struct ip_tunnel, I can rework this series.