linux-kernel - Re: [PATCH RFC net 2/2] net: dsa: propagate brentry flag changes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOiHx==Wk8b43x8HLX4i-o696LioqeHoTpM+kzwn+NBE7dV8wg@mail.gmail.com>
Date: Sat, 19 Apr 2025 12:52:09 +0200
From: Jonas Gorski <jonas.gorski@...il.com>
To: Vladimir Oltean <vladimir.oltean@....com>
Cc: Nikolay Aleksandrov <razor@...ckwall.org>, Ido Schimmel <idosch@...dia.com>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>, 
	Andrew Lunn <andrew@...n.ch>, bridge@...ts.linux.dev, netdev@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC net 2/2] net: dsa: propagate brentry flag changes

On Mon, Apr 14, 2025 at 5:07 PM Vladimir Oltean <vladimir.oltean@....com> wrote:
>
> On Mon, Apr 14, 2025 at 03:49:27PM +0200, Jonas Gorski wrote:
> > I was just in the progress of writing down some thoughts myself I had
> > while thinking about this.
> >
> > So to me passing on the original flags but then no updates of these
> > flags feels wrong. I would expect them to be either never passed on,
> > or always sync it (and let the driver choose to handle or ignore
> > them).
>
> Maybe, but right now it seems like the wrong problem to tackle, and it
> is probably so for every driver for which the VLAN information in the
> receive traffic depends on the bridge VLAN configuration, and is not
> statically configured by the driver.
>
> > Having the cpu port as egress untagged I can easily find one case
> > where this breaks stuff (at least with b53):
> >
> > With lan1, lan2 being dsa ports of the same switch:
> >
> > # bridge with lan1
> > $ ip link add swbridge type bridge vlan_filtering 1
> > $ ip link set lan1 master swbridge
> > $ bridge vlan add dev swbridge vid 10 self pvid untagged
> >
> > # lan2 stand alone
> > $ ip link add lan2.10 link lan2 type vlan id 10
> >
> > as then lan2.10 would never receive any packets, as the VLAN 10
> > packets received by the CPU ports never carry any vlan tags.
> >
> > The core issue here is that b53 switches do not provide any way of
> > knowing the original tagged state of received packets, as the dsa
> > header has no field for that (bcm56* switches do, but these are a
> > different beast).
>
> I see, and indeed, this is yet another angle. The flags of the host
> bridge VLAN do not match with the flags of the flags of the RX filtering
> VLAN, the latter having this comment: "This API only allows programming
> tagged, non-PVID VIDs". The update of flags would not be propagated to
> the driver, neither with your patch nor without it, because VID=10
> already exists on the CPU port, and this isn't a "changed" VLAN (because
> it is an artificial switchdev event emitted by DSA, unbeknownst to the
> bridge). So DSA would still decide to bump the refcount rather than
> notify the driver.
>
> You'd have to ask yourself how do you even expect DSA to react and sort
> this out, between the bridge direction wanting the VLAN untagged and the
> 8021q direction wanting it tagged.

Unless the switch chip supports passing frames as is to the cpu port
(and only there), the only winning move is likely to keep the cpu port
tagged in all cases, and untag in software as needed. I guess this is
what untag_vlan_aware_bridge_pvid is for.

As a side note, b53 datasheets state that all vlans are always tagged
on egress and ignore the untag map for the management port, but that
is clearly not the case for most devices lol.

I am still a bit confused by untag_bridge_pvid, or more specifically
dsa_software_untag_vlan_unaware_bridge(). I would have expected a non
vlan_filtering bridge to ignore any vlan configuration, including
PVID. Or rather the PVID is implicit by a port being bridged with vlan
uppers of a certain vid (e.g. port1.10 + port2 => port2 has PVID 10),
but not explicitly via the bridge vlan configuration.

> > I guess the proper fix for b53 is probably to always have a vlan tag
> > on the cpu port (except for the special vlan 0 for untagged traffic on
> > ports with no PVID), and enable untag_vlan_aware_bridge_pvid.
>
> What's the story with the ports with no PVID, and VID 0?
> In Documentation/networking/switchdev.rst, it is said that VLAN
> filtering bridge ports should drop untagged and VID 0 tagged RX packets
> when there is no pvid.

Hm, that's not what the code does:

With a vlan_filtering bridge VID 0 always gets added to bridge ports
if they get set up:

The order is:

1. filtering is enabled on the port => NETIF_F_HW_VLAN_CTAG_FILTER is set
2. on if up, vlan_device_event() sees that NETIF_F_HW_VLAN_CTAG_FILTER
is enabled and calls vlan_vid_add(dev, .., 0)
3. switchdev/dsa passes this on to the dsa driver via port_vlan_add()
4. all bridge ports being up are now members of vlan 0/have vlan 0 enabled.

Not sure if this is intended/expected behavior, but this enables
untagged rx at least with b53. And since b53 can't restrict forwarding
on a per-vlan base, likely enables forwarding between all bridge ports
for untagged traffic (until a PVID vlan is configured on the bridge,
then the untagged traffic is moved to a different port). This part is
likely not intended.

My first guess was that this is intentional to allow STP & co to work
regardless if there is a PVID/egress untagged VLAN configured. Though
this will likely also require preventing of forwarding unless this is
a configured bridge vlan. Currently trying to read 802.1Q-2022 to see
if I find anything clearly stating how this should (not) work ... .

> > To continue the stream of consciousness, it probably does not make
> > sense to pass on the untagged flag for the bridge/cpu port, because it
> > will affect all ports of the switch, regardless of them being member
> > of the bridge.
>
> Though it needs to be said, usually standalone ports are VLAN-unaware,
> thus, the VLAN ID on RX from their direction is a discardable quantity.
>
> b53 is one of the special drivers, for setting ds->vlan_filtering_is_global = true.
> That makes standalone ports become VLAN filtering even when not under a
> bridge, and is what ultimately causes DSA to program RX filtering VLANs
> to hardware in the first place. Normally, 8021q uppers aren't programmed
> to hardware - see the comments above dsa_user_manage_vlan_filtering().

That's unfortunately how the switch chip works, either you have
filtering on all ports, or on none. Can't toggle it on a per-port
base. The only exception is the CPU port, where you can at least allow
it to send to any VLAN/port.

> > Looking through drivers in net/drivers/dsa, I don't see
> > anyone checking if egress untag is applied to the cpu port, so I
> > wonder if not most, maybe even all (dsa) switch drivers have the same
> > issue and would actually need to keep the cpu port always tagged.
>
> What check do you expect to see exactly? Many drivers treat VLANs on the
> CPU port in special ways, sja1105, felix/ocelot, mv88e6xxx, mt7530, ksz8, maybe others.
> Some of them are subtle and not easy to spot, because they are not from
> the .port_vlan_add() call path (like felix_update_tag_8021q_rx_rule()).

I was mostly looking if .port_vlan_add() has some special handling for
the cpu port as it gets called for vlans configured on the bridge
port, and I didn't see anyone ignoring the untagged flag if
programming the cpu port. Didn't look too deeply though.

> > And looking through the tag_* handlers, only ocelot looks like it may
> > have the information available whether a packet was originally tagged
> > or not, so would also need to have untag_vlan_aware_bridge_pvid enabled.
>
> ocelot can select between 2 different tagging protocols, "ocelot" (which
> leaves the VLAN header unmodified) and "ocelot-8021q" (which does not),
> and the latter indeed does set untag_vlan_aware_bridge_pvid. It's an
> option from which more than one driver could benefit, though, for sure.
>
> mv88e6xxx uses MV88E6XXX_G1_VTU_DATA_MEMBER_TAG_UNMODIFIED which
> provides essentially that information, neither "tag" nor "untag", but
> "keep".

Ah, I see, just checked the datasheet.

 b53 also has a "keep unmodified" mode, but that is global and applies
to all ports, so an untagged at ingress frame then can never become
egress tagged. Completely useless when also having a vlan filtering
bridge.

> > Makes the think the cpu port always tagged should be the default
> > behavior. It's easy to strip a vlan tag that shouldn't be there, but
> > hard to add one that's missing, especially since in contrast to PVID
> > you can have more than one vlan as egress untagged.
>
> I agree and I would like to see b53 converge towards that. But changing
> the default by unsetting this flag in DSA could be a breaking change, we
> should be careful, and definitely only consider that for net-next.
>
> b53 already sets a form (the deprecated form) of ds->untag_bridge_pvid,
> someone with hardware should compare its behavior to the issues
> documented in dsa_software_untag_vlan_unaware_bridge(), and, if
> necessary, transition it to ds->untag_vlan_aware_bridge_pvid or perhaps
> something else.

This is for the case when you have a b53 switch behind a b53 switch, e.g.

sw0.port1..4 user ports
sw0.cpu -> sw1.port1
sw1.port2..4 user ports
sw1.cpu -> eth0

and you have user ports on both switches. Due to the way the broadcom
tag works, if sw0 would be brcm tagged, then sw1 wouldn't see the vlan
tag anymore on rx (since there will now be the brcm tag), and all
traffic would go to the pvid of sw1.port1, thus any forwarding between
ports of sw0 and sw1 would have to be done in software.

In this case, sw0 needs to have its traffic always tagged. Also sw1
needs to keep all vlans on port1 egress tagged, not sure if we
actually take care of that ... (maybe the dsa code does that for us,
didn't check).

Although forwarding works between user ports of sw0 and sw1, from the
linux perspective we would never receive any traffic from sw0's ports,
since we only have sw1's tag, and that one will say sw1.port1.

So while you may be able to configure vlans and forwarding, probably
everything else won't work as expected. Like when you want to send out
a packet on a certain port.

Though I assume this isn't a special issue for b53, and rather a
common issue of chained switch chips, and b53 just acknowledges it /
tries to work around it. I know that marvell (E)DSA can handle this,
but I wouldn't be surprised if many others can't.

Looking at devices in OpenWrt that are affected by this, these device
have exactly one user port on sw1, and that one is mostly used as a
wan port, so as a stand-alone port, not a bridge port. One could argue
here that proper functioning of the "outer" switch's ports is more
important than being able to forward frames in hardware to the wan
port.

Unfortunately I do not have a device at hand, might need to figure out
if I can get my hands on one ... .

Best regards,
Jonas