[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2547b53c-9c22-67ec-99fb-e7e2b403f9f1@gmail.com>
Date: Sun, 27 Jun 2021 09:56:53 -0600
From: David Ahern <dsahern@...il.com>
To: Guillaume Nault <gnault@...hat.com>
Cc: David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
David Ahern <dsahern@...nel.org>,
Simon Horman <simon.horman@...ronome.com>,
Martin Varghese <martin.varghese@...ia.com>,
Eli Cohen <elic@...dia.com>, Jiri Benc <jbenc@...hat.com>,
Tom Herbert <tom@...bertland.com>,
Pablo Neira Ayuso <pablo@...filter.org>,
Harald Welte <laforge@...monks.org>,
Andreas Schultz <aschultz@...p.net>,
Jonas Bonn <jonas@...rbonn.se>
Subject: Re: [PATCH net-next 0/6] net: reset MAC header consistently across L3
virtual devices
On 6/26/21 2:53 PM, Guillaume Nault wrote:
> On Sat, Jun 26, 2021 at 11:50:19AM -0600, David Ahern wrote:
>> On 6/25/21 7:32 AM, Guillaume Nault wrote:
>>> Some virtual L3 devices, like vxlan-gpe and gre (in collect_md mode),
>>> reset the MAC header pointer after they parsed the outer headers. This
>>> accurately reflects the fact that the decapsulated packet is pure L3
>>> packet, as that makes the MAC header 0 bytes long (the MAC and network
>>> header pointers are equal).
>>>
>>> However, many L3 devices only adjust the network header after
>>> decapsulation and leave the MAC header pointer to its original value.
>>> This can confuse other parts of the networking stack, like TC, which
>>> then considers the outer headers as one big MAC header.
>>>
>>> This patch series makes the following L3 tunnels behave like VXLAN-GPE:
>>> bareudp, ipip, sit, gre, ip6gre, ip6tnl, gtp.
>>>
>>> The case of gre is a bit special. It already resets the MAC header
>>> pointer in collect_md mode, so only the classical mode needs to be
>>> adjusted. However, gre also has a special case that expects the MAC
>>> header pointer to keep pointing to the outer header even after
>>> decapsulation. Therefore, patch 4 keeps an exception for this case.
>>>
>>> Ideally, we'd centralise the call to skb_reset_mac_header() in
>>> ip_tunnel_rcv(), to avoid manual calls in ipip (patch 2),
>>> sit (patch 3) and gre (patch 4). That's unfortunately not feasible
>>> currently, because of the gre special case discussed above that
>>> precludes us from resetting the MAC header unconditionally.
>>
>> What about adding a flag to ip_tunnel indicating if it can be done (or
>> should not be done since doing it is the most common)?
>
> That's feasible. I didn't do it here because I wanted to keep the
> patch series focused on L3 tunnels. Modifying ip_tunnel_rcv()'s
> prototype would also require updating erspan_rcv(), which isn't L3
> (erspan carries Ethernet frames). I was feeling such consolidation
> would be best done in a follow up patch series.
I was thinking a flag in 'struct ip_tunnel'. It's the private data for
those netdevices, so a per-instance setting. I haven't walked through
the details to know if it would work.
>
> I can repost if you feel strongly about it. Otherwise, I'll follow up
> with the ip_tunnel_rcv() consolidation in a later patch. Just let me
> know if you have any preference.
>
>>> The original motivation is to redirect bareudp packets to Ethernet
>>> devices (as described in patch 1). The rest of this series aims at
>>> bringing consistency across all L3 devices (apart from gre's special
>>> case unfortunately).
>>
>> Can you add a selftests that covers the use cases you mention in the
>> commit logs?
>
> I'm already having a selftests for most of the tunnels (and their
> different operating modes), gtp being the main exception. But it's not
> yet ready for upstream, as I'm trying to move the topology in its own
> .sh file, to keep the main selftests as simple as possible.
> I'll post it as soon as I get it in good shape.
>
That works. Thanks,
Powered by blists - more mailing lists