netdev - Re: [net-next PATCH] net: allow vlan traffic to be received under bond

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEP_g=83tCQMVPOTT82GBw6GamodpqU-SAeYyx9noWnzbGfUpg@mail.gmail.com>
Date:	Mon, 10 Oct 2011 19:43:03 -0700
From:	Jesse Gross <jesse@...ira.com>
To:	John Fastabend <john.r.fastabend@...el.com>
Cc:	Jiri Pirko <jpirko@...hat.com>,
	"davem@...emloft.net" <davem@...emloft.net>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"fubar@...ibm.com" <fubar@...ibm.com>
Subject: Re: [net-next PATCH] net: allow vlan traffic to be received under bond

On Mon, Oct 10, 2011 at 7:07 PM, John Fastabend
<john.r.fastabend@...el.com> wrote:
> On 10/10/2011 3:37 PM, Jiri Pirko wrote:
>> Mon, Oct 10, 2011 at 09:16:41PM CEST, john.r.fastabend@...el.com wrote:
>>> The following configuration used to work as I expected. At least
>>> we could use the fcoe interfaces to do MPIO and the bond0 iface
>>> to do load balancing or failover.
>>>
>>>       ---eth2.228-fcoe
>>>       |
>>> eth2 -----|
>>>          |
>>>          |---- bond0
>>>          |
>>> eth3 -----|
>>>       |
>>>       ---eth3.228-fcoe
>>>
>>> This worked because of a change we added to allow inactive slaves
>>> to rx 'exact' matches. This functionality was kept intact with the
>>> rx_handler mechanism. However now the vlan interface attached to the
>>> active slave never receives traffic because the bonding rx_handler
>>> updates the skb->dev and goto's another_round. Previously, the
>>> vlan_do_receive() logic was called before the bonding rx_handler.
>>>
>>> Now by the time vlan_do_receive calls vlan_find_dev() the
>>> skb->dev is set to bond0 and it is clear no vlan is attached
>>> to this iface. The vlan lookup fails.
>>>
>>> This patch moves the VLAN check above the rx_handler. A VLAN
>>> tagged frame is now routed to the eth2.228-fcoe iface in the
>>> above schematic. Untagged frames continue to the bond0 as
>>> normal. This case also remains intact,
>>>
>>> eth2 --> bond0 --> vlan.228
>>>
>>> Here the skb is VLAN tagged but the vlan lookup fails on eth2
>>> causing the bonding rx_handler to be called. On the second
>>> pass the vlan lookup is on the bond0 iface and completes as
>>> expected.
>>>
>>> Putting a VLAN.228 on both the bond0 and eth2 device will
>>> result in eth2.228 receiving the skb. I don't think this is
>>> completely unexpected and was the result prior to the rx_handler
>>> result.
>>>
>>> Note, the same setup is also used for other storage traffic that
>>> MPIO is used with eg. iSCSI and similar setups can be contrived
>>> without storage protocols.
>>>
>>> Signed-off-by: John Fastabend <john.r.fastabend@...el.com>
>>> ---
>>>
>>> net/core/dev.c |   22 +++++++++++-----------
>>> 1 files changed, 11 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>> index 70ecb86..8b6118a 100644
>>> --- a/net/core/dev.c
>>> +++ b/net/core/dev.c
>>> @@ -3231,6 +3231,17 @@ another_round:
>>> ncls:
>>> #endif
>>>
>>> +    if (vlan_tx_tag_present(skb)) {
>>> +            if (pt_prev) {
>>> +                    ret = deliver_skb(skb, pt_prev, orig_dev);
>>> +                    pt_prev = NULL;
>>> +            }
>>> +            if (vlan_do_receive(&skb))
>>> +                    goto another_round;
>>> +            else if (unlikely(!skb))
>>> +                    goto out;
>>> +    }
>>> +
>>>      rx_handler = rcu_dereference(skb->dev->rx_handler);
>>>      if (rx_handler) {
>>>              if (pt_prev) {
>>> @@ -3251,17 +3262,6 @@ ncls:
>>>              }
>>>      }
>>>
>>> -    if (vlan_tx_tag_present(skb)) {
>>> -            if (pt_prev) {
>>> -                    ret = deliver_skb(skb, pt_prev, orig_dev);
>>> -                    pt_prev = NULL;
>>> -            }
>>> -            if (vlan_do_receive(&skb))
>>> -                    goto another_round;
>>> -            else if (unlikely(!skb))
>>> -                    goto out;
>>> -    }
>>> -
>>>      /* deliver only exact match when indicated */
>>>      null_or_dev = deliver_exact ? skb->dev : NULL;
>>>
>>>
>>
>> Hmm, I must look at this again tomorrow but I have strong feeling this
>> will break some some scenario including vlan-bridge-macvlan.
>
> Yes please review... I tested cases with vlan, bridge, and macvlan
> components and believe this works unless I missed something.
>
> Maybe Jesse, can comment though on why this commit that moved (and
> cleaned up) the vlan tag handling put the vlan_do_receive below the
> rx_handler rather than above it. Was this intended to fix something?

The original reason was to ensure packets received from NICs that do
stripping behaved the same as those that don't.  At the time, the
packets with inline vlan tags were handled by the same code that
handled upper layer protocols so it was important that code for vlan
stripped tags be immediately before that.  Otherwise, packets might be
taken either by the bridge hook or vlan code depending the the type of
device.

However, that's no longer an issue because we now emulate vlan
acceleration by untagging packets at the beginning of
__netif_receive_skb(), so the code path will always be the same.
Furthermore, based on feedback received since that patch it seems
pretty clear that people prefer the behavior where vlan devices take
traffic first, so now that we can have both that and consistent
behavior it seems to be the way to go.

This looks correct to me:
Acked-by: Jesse Gross <jesse@...ira.com>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html