netdev - [PATCH net-next 4/4] net: dsa: set configure_vlan_while_not

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b6ec9450-6b3e-0473-a2f9-b57016f010c1@gmail.com>
Date:   Fri, 11 Sep 2020 11:23:28 -0700
From:   Florian Fainelli <f.fainelli@...il.com>
To:     Vladimir Oltean <olteanv@...il.com>
Cc:     davem@...emloft.net, vivien.didelot@...il.com, andrew@...n.ch,
        netdev@...r.kernel.org
Subject: [PATCH net-next 4/4] net: dsa: set configure_vlan_while_not_filtering
 to true by default



On 9/11/2020 8:43 AM, Vladimir Oltean wrote:
> On Thu, Sep 10, 2020 at 08:09:19PM -0700, Florian Fainelli wrote:
>> On 9/10/2020 5:03 PM, Vladimir Oltean wrote:
>>> On Thu, Sep 10, 2020 at 02:58:04PM -0700, Florian Fainelli wrote:
>>>> On 9/9/2020 11:34 AM, Florian Fainelli wrote:
>>>>> On 9/9/2020 10:53 AM, Vladimir Oltean wrote:
>>>>>> On Wed, Sep 09, 2020 at 10:22:42AM -0700, Florian Fainelli wrote:
>>>>>>> How do you make sure that the CPU port sees the frame untagged
>>>>>>> which would
>>>>>>> be necessary for a VLAN-unaware bridge? Do you have a special remapping
>>>>>>> rule?
>>>>>>
>>>>>> No, I don't have any remapping rules that would be relevant here.
>>>>>> Why would the frames need to be necessarily untagged for a VLAN-unaware
>>>>>> bridge, why is it a problem if they aren't?
>>>>>>
>>>>>> bool br_allowed_ingress(const struct net_bridge *br,
>>>>>>               struct net_bridge_vlan_group *vg, struct sk_buff *skb,
>>>>>>               u16 *vid, u8 *state)
>>>>>> {
>>>>>>       /* If VLAN filtering is disabled on the bridge, all packets are
>>>>>>        * permitted.
>>>>>>        */
>>>>>>       if (!br_opt_get(br, BROPT_VLAN_ENABLED)) {
>>>>>>           BR_INPUT_SKB_CB(skb)->vlan_filtered = false;
>>>>>>           return true;
>>>>>>       }
>>>>>>
>>>>>>       return __allowed_ingress(br, vg, skb, vid, state);
>>>>>> }
>>>>>>
>>>>>> If I have a VLAN on a bridged switch port where the bridge is not
>>>>>> filtering, I have an 8021q upper of the bridge with that VLAN ID.
>>>>>
>>>>> Yes that is the key right there, you need an 8021q upper to pop the VLAN
>>>>> ID or push it, that is another thing that users need to be aware of
>>>>> which is a bit awkward, most expect things to just work. Maybe we should
>>>>> just refuse to have bridge devices that are not VLAN-aware, because this
>>>>> is just too cumbersome to deal with.
>>>>
>>>> With the drivers that you currently maintain and with the CPU port being
>>>> always tagged in the VLANs added to the user-facing ports, when you are
>>>> using a non-VLAN aware bridge, do you systematically add an br0.1 upper
>>>> 802.1Q device to pop/push the VLAN tag?
>>>
>>> Talking to you, I realized that I confused you uselessly. But in doing
>>> that, I actually cleared up a couple of things for myself. So thanks, I
>>> guess?
>>>
>>> This is actually a great question, and it gave me the opportunity to
>>> reflect.  So, only 1 driver that I maintain has the logic of always
>>> marking the CPU port as egress-tagged. And that would be ocelot/felix.
>>>
>>> I need to give you a bit of background.
>>> The DSA mode of Ocelot switches is more of an afterthought, and I am
>>> saying this because there is a distinction I need to make between the
>>> "CPU port module" (which is a set of queues that the CPU can inject and
>>> extract frames from), and the "NPI port" (which is an operating mode,
>>> where a regular front-panel Ethernet port is connected internally to the
>>> CPU port module and injection/extraction I/O can therefore be done via
>>> Ethernet, and that's your DSA).
>>> Basically, when the NPI mode is in use, then it behaves less like an
>>> Ethernet port, and more like a set of CPU queues that connect over
>>> Ethernet, if that makes sense.
>>
>> SYSTEMPORT + bcm_sf2 act a lot like that, too, except the CPU port still
>> obeys VLAN, buffering, classification and other switch internal rules, but
>> essentially we want to map queues from the user-facing port to DMAs used by
>> the host processor.
> 
> Digressing a lot here, but the NPI port of Ocelot switches really isn't
> like that. For example, the NPI port doesn't even need to be in the
> reachability domain for a frame to reach it. Other example, a TCAM rule
> to drop a frame won't prevent it from reaching the NPI port, if that was
> previously selected as a destination for that frame. Other example:
> there is no source address learning for traffic injected by the network
> stack over the NPI port. So, on RX, every frame that should reach the
> CPU is actually _flooded_, due to the destination being unknown. Other
> example: the NPI port is so hardcoded to wrap everything in an
> Extraction Frame Header, that it even wraps PAUSE frames in it. That one
> especially is so bad that I have a patch series in the works to simply
> disable the NPI port and use tag_8021q instead. I just hate it.
> 
>>
>>> The port settings for VLAN are bypassed, and the packet is copied as-is
>>> from ingress to the NPI port. The egress-tagged port VLAN configuration
>>> does not actually result in a VLAN header being pushed into the frame,
>>> if that egress port is the NPI port.  Instead, the classified VLAN ID
>>> (i.e. derived from the packet, or from the port-based VLAN, or from
>>> custom VLAN classification TCAM rules) is always kept in a 12-bit field
>>> of the Extraction Frame Header.
>>>
>>> Currently I am ignoring the classified VLAN from the Extraction Frame
>>> Header, and simply passing the skb as-is to the stack. As-is, meaning as
>>> the switch ingress port had received it. So, in retrospect, my patch
>>> 183be6f967fe ("net: dsa: felix: send VLANs on CPU port as
>>> egress-tagged") is nothing more than a formality to make this piece of
>>> code shut up and not error out:
>>>
>>> static int ocelot_port_set_native_vlan(struct ocelot *ocelot, int port,
>>> 				       u16 vid)
>>> {
>>> 	struct ocelot_port *ocelot_port = ocelot->ports[port];
>>> 	u32 val = 0;
>>>
>>> 	if (ocelot_port->vid != vid) {
>>> 		/* Always permit deleting the native VLAN (vid = 0) */
>>> 		if (ocelot_port->vid && vid) {
>>> 			dev_err(ocelot->dev,
>>> 				"Port already has a native VLAN: %d\n",
>>> 				ocelot_port->vid);
>>> 			return -EBUSY;
>>> 		}
>>> 		ocelot_port->vid = vid;
>>> 	}
>>>
>>> It's just now that I connected the dots and realized that.
>>>
>>> So, looks like I don't really know what it's like to always have a
>>> tagged skb on ingress, even for egress-tagged VLANs. It must suck, I
>>> guess?
>>>
>>> I think if I were in that situation, and the source port would be under
>>> a vlan_filtering=0 bridge, then I would simply pop the tag from the skb
>>> in the DSA rcv function, for all VLANs that I don't have an 8021q upper
>>> for.
>>>
>>> Explaining this, it makes a lot of sense to do what Vitesse / Microsemi
>>> / Microchip is doing, which is to copy the frame as-is to the CPU, and
>>> to also tell you, separately, what the classified VLAN is. For example,
>>> in vlan_filtering=0 mode, the classified VLAN will always be 1,
>>> regardless of how the frame is tagged, because VLAN awareness is truly
>>> turned off for the ingress port, and the port-based VLAN is always used.
>>> In this way, you have the most flexibility: you can either ignore the
>>> classified VLAN and proceed with just what was in the ingress skb (this
>>> way, you'll have a switch that is not VLAN-unaware, just "checking" as
>>> opposed to "secure". It has passed the ingress VLAN filter, but you
>>> still remember what the VLAN ID was.
>>> Or you can have a completely VLAN-unaware switch, if you pop all VLANs
>>> that you can find in the skb, and add a hwaccel tag based on the
>>> classified VLAN, if it isn't equal to the pvid of the port. This is
>>> great for things like compatibility with a vlan_filtering=0 upper bridge
>>> which is what we're talking about.
>>>
>>> Basically, this is what, I think, DSA tries to emulate with the rule of
>>> copying the flags of a user port VLAN to the CPU port too. If we had the
>>> API to request an "unmodified" VLAN (not egress-tagged, not
>>> egress-untagged, just the same as on ingress), I'm sure we'd be using
>>> that by default (useful when vlan_filtering is 1). Knowing what the
>>> classified VLAN was also can be very useful at times (like when
>>> vlan_filtering is 0), so if there was an API for that, I'm sure DSA
>>> would have used that as well. With no such APIs, we can only use
>>> approximations.
>>
>> egress unmodified is what mv88e6xxx uses which is why I do not believe they
>> have had the same issues that I had with vlan_filtering=0. For Broadcom
>> switches there is not any option to have an umodified mode, the CPU port
>> must have a valid VLAN membership (with vlan_filtering=1) and the egress
>> untagged from the CPU port to the Ethernet MAC must match the expectations
>> of the software data path behind.
>>
> 
> So excepting mv88e6xxx and ocelot/felix, you are really in the same
> situation now with b53 and starfighter as everybody else is, am I not
> right? The "pvid and not untagged" VLAN is problematic for everybody in
> vlan_filtering=0 mode.

Yes, this is a problem for any switch that forces the CPU port to be 
tagged in all VLANs added to user-ports I would say be it a driver 
decision or hardware limitation.

The customers I support with bcm_sf2 mostly use the switch ports as 
standalone network devices (which reminds me I need to test your DSA RX 
filtering series, sigh) with 802.1Q uppers on top. Bridging works, but 
does not seem to be their main use case at all that comes from having 
migrated from 3 independent Ethernet MACs to an integrated switch, and 
now back to integrated Ethernet MACs.

> 
>>>
>>>> I am about ready to submit the changes we discussed to b53, but I am still a
>>>> bit uncomfortable with this part of the change because it will make the CPU
>>>> port follow the untagged attribute of an user-facing port.
>>>>
>>>> @@ -1444,7 +1427,7 @@ void b53_vlan_add(struct dsa_switch *ds, int port,
>>>>                           untagged = true;
>>>>
>>>>                   vl->members |= BIT(port);
>>>> -               if (untagged && !dsa_is_cpu_port(ds, port))
>>>> +               if (untagged)
>>>>                           vl->untag |= BIT(port);
>>>>                   else
>>>>                           vl->untag &= ~BIT(port);
>>>> @@ -1482,7 +1465,7 @@ int b53_vlan_del(struct dsa_switch *ds, int port,
>>>>                   if (pvid == vid)
>>>>                           pvid = b53_default_pvid(dev);
>>>>
>>>> -               if (untagged && !dsa_is_cpu_port(ds, port))
>>>> +               if (untagged)
>>>>                           vl->untag &= ~(BIT(port));
>>>>
>>>
>>> Which is ok, I believe? I mean, that's the default DSA logic. If you
>>> think that isn't ok, we should change it at a more fundamental level.
>>> What we've been discussing so far is akin to your current setup, not
>>> to the one you're planning to change to, isn't it? Are there any
>>> problems with the new setup?
>>
>> The change above is functional because the CPU port ends up being egress
>> untagged in all of the bridge's default_pvid, whether vlan_filtering is 0 or
>> 1 and that works.
> 
> Yeah, heard you about one mistake cancelling another one out.
> 
>> The slightly confusing part is that a vlan_filtering=1 bridge accepts the
>> default_pvid either tagged or untagged whereas a vlan_filtering=0 bridge
>> does not, except for DHCP for instance. I would have to add a br0.1 802.1Q
>> upper to take care of the default_pvid being egress tagged on the CPU port.
>>
>> We could solve this in the DSA receive path, or the Broadcom tag receive
>> path as you say since that is dependent on the tagging format and switch
>> properties.
>>
>> With Broadcom tags enabled now, all is well since we can differentiate
>> traffic from different source ports using that 4 bytes tag.
>>
>> Where this broke was when using a 802.1Q separation because all frames that
>> egressed the CPU were egress tagged and it was no longer possible to
>> differentiate whether they came from the LAN group in VID 1 or the WAN group
>> in VID 2. But all of this should be a thing of the past now, ok, all is
>> clear again now.
> 
> Or we could do this, what do you think?

Yes, this would be working, and I just tested it with the following 
delta on top of my b53 patch:

diff --git a/drivers/net/dsa/b53/b53_common.c 
b/drivers/net/dsa/b53/b53_common.c
index 46ac8875f870..73507cff3bc4 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1427,7 +1427,7 @@ void b53_vlan_add(struct dsa_switch *ds, int port,
                         untagged = true;

                 vl->members |= BIT(port);
-               if (untagged)
+               if (untagged && !dsa_is_cpu_port(ds, port))
                         vl->untag |= BIT(port);
                 else
                         vl->untag &= ~BIT(port);
@@ -1465,7 +1465,7 @@ int b53_vlan_del(struct dsa_switch *ds, int port,
                 if (pvid == vid)
                         pvid = b53_default_pvid(dev);

-               if (untagged)
+               if (untagged && !dsa_is_cpu_port(ds, port))
                         vl->untag &= ~(BIT(port));

                 b53_set_vlan_entry(dev, vid, vl);

and it works, thanks!

> 
>  From 178a46f0f96555e17f3fcefa356e324a92dafab2 Mon Sep 17 00:00:00 2001
> From: Vladimir Oltean <olteanv@...il.com>
> Date: Fri, 11 Sep 2020 18:16:48 +0300
> Subject: [PATCH] net: bridge: pop vlan from skb if filtering is disabled but
>   it's a pvid
> 
> Currently the bridge untags VLANs from its VLAN group in
> __allowed_ingress() only when VLAN filtering is enabled.
> 
> When installing a pvid in egress-tagged mode:
> 
> ip link add dev br0 type bridge vlan_filtering 0
> ip link set swp0 master br0
> bridge vlan del dev swp0 vid 1
> bridge vlan add dev swp0 vid 1 pvid
> 
> When adding a VLAN on a DSA switch interface, DSA configures the VLAN
> membership of the CPU port using the same flags as swp0 (in this case
> "pvid and not untagged"), in an attempt to copy the frame as-is from
> ingress to the CPU.
> 
> However, in this case, the packet may arrive untagged on ingress, it
> will be pvid-tagged by the ingress port, and will be sent as
> egress-tagged towards the CPU. Otherwise stated, the CPU will see a VLAN
> tag where there was none to speak of on ingress.

We could also indicate that some DSA switch drivers systematically 
configure their CPU port to be tagged in all VLANs, so even in the 
following case:

ip link add dev br0 type bridge vlan_filtering 0
ip link set swp0 master br0

we will be receiving egress tagged frames for default_pvid because the 
CPU port is configured as egress tagged for that VLAN. In the case of a 
vlan_filtering=1, the default_pvid is accepted as tagged or untagged by 
the bridge master device. (something along those lines).

> 
> When vlan_filtering is 1, this is not a problem, as stated in the first
> paragraph, because __allowed_ingress() will pop it. But currently, when
> vlan_filtering is 0 and we have such a VLAN configuration, we need an
> 8021q upper (br0.1) to be able to ping over that VLAN.
> 
> Make the 2 cases (vlan_filtering 0 and 1) behave the same way as popping
> the pvid, if the skb happens to be tagged with it, when vlan_filtering
> is 0.
> 
> Signed-off-by: Vladimir Oltean <olteanv@...il.com>

Tested-by: Florian Fainelli <f.fainelli@...il.com>

> ---
>   net/bridge/br_vlan.c | 16 ++++++++++++++++
>   1 file changed, 16 insertions(+)
> 
> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
> index d2b8737f9fc0..b1e7211bae51 100644
> --- a/net/bridge/br_vlan.c
> +++ b/net/bridge/br_vlan.c
> @@ -580,7 +580,23 @@ bool br_allowed_ingress(const struct net_bridge *br,
>   	 * permitted.
>   	 */
>   	if (!br_opt_get(br, BROPT_VLAN_ENABLED)) {
> +		u16 vid;
> +
>   		BR_INPUT_SKB_CB(skb)->vlan_filtered = false;
> +
> +		/* See comment in __allowed_ingress about how skb can end up
> +		 * here not having a hwaccel tag
> +		 */
> +		if (unlikely(!skb_vlan_tag_present(skb) &&
> +			     skb->protocol == br->vlan_proto)) {
> +			skb = skb_vlan_untag(skb);
> +			if (unlikely(!skb))
> +				return false;
> +		}
> +
> +		if (!br_vlan_get_tag(skb, &vid) && vid == br_get_pvid(vg))
> +			__vlan_hwaccel_clear_tag(skb);
> +
>   		return true;
>   	}
>   
> 

-- 
Florian