[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKgT0Udy4=The7YqUFTA1FKKFBH-spLqVEzfqKXo=oDdWzvOFA@mail.gmail.com>
Date: Tue, 30 Jan 2018 14:38:38 -0800
From: Alexander Duyck <alexander.duyck@...il.com>
To: "Keller, Jacob E" <jacob.e.keller@...el.com>
Cc: Shannon Nelson <shannon.nelson@...cle.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"Duyck, Alexander H" <alexander.h.duyck@...el.com>
Subject: Re: macvlan devices and vlan interaction
On Tue, Jan 30, 2018 at 2:20 PM, Keller, Jacob E
<jacob.e.keller@...el.com> wrote:
>> -----Original Message-----
>> From: Alexander Duyck [mailto:alexander.duyck@...il.com]
>> Sent: Tuesday, January 30, 2018 12:49 PM
>> To: Shannon Nelson <shannon.nelson@...cle.com>
>> Cc: Keller, Jacob E <jacob.e.keller@...el.com>; netdev@...r.kernel.org; Duyck,
>> Alexander H <alexander.h.duyck@...el.com>
>> Subject: Re: macvlan devices and vlan interaction
>>
>> > Hi Jake,
>> >
>> > The current behavior seems logical to me, but I suppose Alex might argue
>> > differently. The macvlan was put onto the default lowerdev assuming the
>> > lowerdev will hand it all the default traffic, and then the macvlan splits
>> > out its own vlan traffic. As soon as the lowerdev assumption changes, it is
>> > going to change what gets pushed up to the macvlan dev. If the lowerdev is
>> > separating the vlan traffic out of the "default" flow headed to the macvlan,
>> > then the initial assumption has changed and the vlan traffic has been
>> > vectored off before it can be delivered up the stack to the macvlan.
>>
>> It depends on what your goal is. In my mind making macvlan VLAN
>> challenged is the easier solution since you just have to add some
>> pass-thru ops to the VLAN drivers and you can guarantee that you are
>> passing MAC-VLAN pair for each address on the interface for the call.
>> The alternative gets to be a bit more complex since it requires
>> multiple rules, one for non-tagged and one per VLAN for tagged
>> traffic.
>>
>> > There's an argument that the lowerdev shouldn't know anything about the
>> > upperdev's routing, just deliver to the upperdev and let the upperdev worry
>> > about it. But perhaps this becomes is a question of precedence: does the
>> > lowerdev split traffic first by mac address or by vlan tag.
>>
>> That is where things get messy. We found it splits by VLAN tag if the
>> VLAN is present on the lowerdev, or it splits by MAC if it is not.
>> That is why as Jake pointed out adding the VLAN to the lower dev
>> causes issues.
>>
>
> Yes, right now the problem is that it splits differently depending on whether or not a VLAN is present on the lower dev.
>
>> > I don't like your option 1: as you point out, it breaks current
>> > functionality, likely depended upon in some containers that are using
>> > macvlans to manage their traffic. We don't know what's going on inside that
>> > container and I don't think we want to break its ability to split its own
>> > vlans.
>>
>> Maybe we should look at an option 1.5. Mark the lowerdev as VLAN
>> challenged if any macvlan is operating with any VLANs enabled on it
>> since we can only really allow VLAN filtering to occur at one level
>> reliably. Either that or maybe we look at making VLANs and rx_handler
>> setups mutually exclusive.
>>
>
> Actually.. what if we changed the order of splitting, so that we always check macvlan MAC address first, before checking VLANs?
>
> This should work in both cases of macvlan -> VLAN -> lowerdev, or VLAN -> macvlan -> lowerdev.
The thing you have to then watch out for is how something like this
would impact bonding or bridging since both of those use the Rx
handler as well from my understanding. I suppose it would make sense
though to do Rx handler first and then VLAN since the Rx handler
should be placed on the VLAN itself if you are
bridging/bonding/macvlan over a VLAN versus the reverse.
> In the first case, the macvlan isn't directly attached to the lowerdev, so we'd do VLAN filtering first, and then the VLAN would check MAC address.
Right, that bit works without any issues.
> In the second case, even if lowerdev also had the VLAN, we'd do macvlan filtering first, and things would work.
That piece makes sense, at least for macvlan.
> Both the lowerdev VLAN and upperdev macvlan should receive traffic correctly in this case.
>
> I think this resolves the problem of which device goes to which VLAN.
Right, this works for macvlan. The only concern I have then is bond
and bridging. It is probably fine but it wouldn't hurt to check.
> I don't know if it resolves the issues with leaked VLANs, where a VLAN added to the macvlan device causes traffic for that VLAN to be received by all the MAC addresses of the lowerdev...
>
> I suppose this might not be considered a problem? The traffic could be received either way if you're in promiscuous mode. It's not like we have a sense of "trusted" configuration either.
>
> I think some separate work for the case of macvlan on top of VLAN on top of lower dev can be done as well, to enable offloading in this case. I'll have some more thoughts on that soon.
>
> Thanks,
> Jake
So the issue with all of this apears to be:
commit 2425717b27eb92b175335ca4ff0bb218cbe0cb64
Author: John Fastabend <john.r.fastabend@...el.com>
Date: Mon Oct 10 09:16:41 2011 +0000
net: allow vlan traffic to be received under bond
The problem is that patch made it so that you could put a bond on two
interfaces and still peel out traffic via VLANs. Prior to this patch
the code is the way we were already discussing.
>
>> > Like I said, I think the current behavior is mostly correct, but a version
>> > of option 2 might be good to help support offload of the mac+vlan pair into
>> > a macvlan channel.
>>
>> The only issue is I am not completely sure how option 2 solves the
>> original issue. Yes it makes the filtering more explicit, but the
>> network stack is still filtering VLANs before we get to the rx_handler
>> calls, or is this a fix that works for the offloaded approach only and
>> doesn't address the issues in the non-offloaded case? It's also
>> possible I might have missed something.
>>
>> - Alex
Powered by blists - more mailing lists