[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAOiHx=nw-phPcRPRmHd6wJ5XksxXn9kRRoTuqH4JZeKHfxzD5A@mail.gmail.com>
Date: Tue, 28 Oct 2025 11:15:23 +0100
From: Jonas Gorski <jonas.gorski@...il.com>
To: Vladimir Oltean <olteanv@...il.com>
Cc: Andrew Lunn <andrew@...n.ch>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>, Álvaro Fernández Rojas <noltari@...il.com>,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net v2] net: dsa: tag_brcm: legacy: fix untagged rx on
unbridged ports for bcm63xx
On Mon, Oct 27, 2025 at 10:15 PM Vladimir Oltean <olteanv@...il.com> wrote:
>
> On Mon, Oct 27, 2025 at 08:46:21PM +0100, Jonas Gorski wrote:
> > The internal switch on BCM63XX SoCs will unconditionally add 802.1Q VLAN
> > tags on egress to CPU when 802.1Q mode is enabled. We do this
> > unconditionally since commit ed409f3bbaa5 ("net: dsa: b53: Configure
> > VLANs while not filtering").
> >
> > This is fine for VLAN aware bridges, but for standalone ports and vlan
> > unaware bridges this means all packets are tagged with the default VID,
> > which is 0.
> >
> > While the kernel will treat that like untagged, this can break userspace
> > applications processing raw packets, expecting untagged traffic, like
> > STP daemons.
> >
> > This also breaks several bridge tests, where the tcpdump output then
> > does not match the expected output anymore.
> >
> > Since 0 isn't a valid VID, just strip out the VLAN tag if we encounter
> > it, unless the priority field is set, since that would be a valid tag
> > again.
> >
> > Fixes: 964dbf186eaa ("net: dsa: tag_brcm: add support for legacy tags")
> > Signed-off-by: Jonas Gorski <jonas.gorski@...il.com>
> > ---
>
> Reviewed-by: Vladimir Oltean <olteanv@...il.com>
>
> Sorry for dropping the ball on v1. To reply to your reply there,
> https://lore.kernel.org/netdev/CAOiHx=mNnMJTnAN35D6=LPYVTQB+oEmedwqrkA6VRLRVi13Kjw@mail.gmail.com/
> I hadn't realized that b53 sets ds->untag_bridge_pvid conditionally,
> which makes any consolidation work in stable trees very complicated
> (although still desirable in net-next).
It's for some more obscure cases where we cannot use the Broadcom tag,
like a switch where the CPU port isn't a management port but a normal
port. I am not sure this really exists, but maybe Florian knows if
there are any (still used) boards where this applies.
If not, I am more than happy to reject this path as -EINVAL instead of
the current TAG_NONE with untag_bridge_pvid = true.
>
> | And to sidetrack the discussion a bit, I wonder if calling
> | __vlan_hwaccel_clear_tag() in
> | dsa_software_untag_vlan_(un)aware_bridge() without checking the
> | vlan_tci field strips 802.1p information from packets that have it. I
> | fail to find if this is already parsed and stored somewhere at a first
> | glance.
>
> 802.1p information should be parsed in vlan_do_receive() if vlan_find_dev()
> found something:
>
> skb->priority = vlan_get_ingress_priority(vlan_dev, skb->vlan_tci);
>
> This logic is invoked straight from __netif_receive_skb_core(), and
> relative to dsa_switch_rcv(), it hasn't run yet.
I see. So I presume it runs again after dsa_switch_rcv() has run? Or
rather __netif_receive_skb_core() jumps to to another_round or so?
> Apart from that and user-configured netfilter/tc rules, I don't think
> there's anything else in the kernel that processes the vlan_tci on
> ingress (of course that isn't to say anything about user space).
>
> With regard to dsa_software_untag_vlan_unaware_bridge(), which I'd like
> to see removed rather than reworked, it does force you to use a br0.0
> VLAN upper to not strip the 802.1p info, which is OK.
>
> With regard to dsa_software_untag_vlan_aware_bridge(), it only strips
> VID values which are != 0 (because the bridge PVID iself is != 0 - if it
> was zero that would be another bug, the port should have dropped the
> packet with a VLAN-aware bridge).
AFAICT if a port on a vlan aware bridge has no PVID configured and
receives a VID 0 tagged packet, then
dsa_software_untag_vlan_aware_bridge() will strip it:
Since skb-proto is 802.1Q, we call skb_vlan_untag(), which will set
skb->vlan_proto to 802.1q and skb->vlan_tci to 0
skb->vlan_all is now != 0, so skb_vlan_tag_present() returns true.
skb_vlan_tag_get_id() returns 0, since skb->vlan_tci is 0.
So on a vlan_aware bridge with untag_vlan_aware_bridge_pvid enabled we
now call dsa_software_untag_vlan_aware_bridge() with vid = 0.
In there br_vlan_get_pvid_rcu() sets pvid to 0 if no PVID is
configured since br_get_pvid() returns 0 in that case, so the
condition (vid == pvid) is fulfilled, so we call
__vlan_hwaccel_clear_tag(), stripping the tag.
Obviously for any other configurations (e.g. PVID > 0 and VID = 0) we
don't do anything.
For BCM63xx, the issue with VID 0 tagged traffic on egress on the CPU
port is only if there is no PVID configured on a port - once there is
a PVID, the ingress untagged traffic will be tagged with the
appropriate VLAN ID on egress on the CPU port, and everything is fine.
So in a sense it happens to work by accident for ports without a PVID
and untagged traffic on them.
But vlan aware bridges are the only cases where PVIDs do get
configured/are active, in all other configurations PVID is programmed
as 0 on ports.
Best regards,
Jonas
Powered by blists - more mailing lists