netdev - Re: DSA: some questions regarding TX forwarding offload

Message-ID: <20211007010820.kkj3yqnmdrh4nvo4@skbuf>
Date:   Thu, 7 Oct 2021 01:08:21 +0000
From:   Vladimir Oltean <vladimir.oltean@....com>
To:     Florian Fainelli <f.fainelli@...il.com>
CC:     Alvin Šipraga <ALSI@...g-olufsen.dk>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Andrew Lunn <andrew@...n.ch>,
        Tobias Waldekranz <tobias@...dekranz.com>
Subject: Re: DSA: some questions regarding TX forwarding offload

On Tue, Oct 05, 2021 at 07:50:40PM -0700, Florian Fainelli wrote:
> On 10/5/2021 3:12 AM, Vladimir Oltean wrote:
> > I don't want to answer any of these questions until I understand how
> > your hardware intends the FID and FID_EN bits from the DSA header to
> > be used. The FID only has 2 bits, so it is clear to me that it doesn't
> > have the same understanding of the term as mv88e6xxx, if the Realtek
> > switch has up to 4 FIDs while Marvell up to 4K.
> > 
> > You should ask yourself not only how to prevent leakage, but also the
> > flip side, how should I pass the packet to the switch in such a way that
> > it will learn its MAC SA in the right FID, assuming that you go with FDB
> > isolation first and figure that out. Once that question is answered, you
> > can in principle specify an allowance port mask which is larger than
> > needed (the entire mask of user ports) and the switch should only
> > forward it towards the ports belonging to the same FID, which are
> > roughly equivalent with the ports under a specific bridge. You can
> > create a mapping between a FID and dp->bridge_num. Makes sense or am I
> > completely off?
> 
> Sorry for sort of hijacking this discussion.
> 
> Broadcom switches do not have FIDs however using Alvin's topology, and given
> the existing bridge support in b53, it also does not look like there is
> leaking from one bridge to other because of the port matrix configuration
> which is enforced in addition to the VLAN membership.

I don't think we're using the word "leaking" in quite the same way.
TX forwarding offload implies calling brcm_tag_xmit_ll(). How would the
port matrix configuration and VLAN membership help you? The CPU port
(ingress for this packet) should be in the forwarding matrix of all
other ports, and in all VLAN IDs. If you blindly put a port mask for
this packet that targets all user ports, it should reach all user ports
no questions asked, regardless of the bridge they're under.

Alvin's case was discussing the idea of allowing all user ports to be
candidates for destinations of this packet. Hoping that the FDB lookup
would come in and further restrict those candidates to only the ones
belonging to, say, br0. Leaking is possible if you don't have FDB
isolation between br0 and br1, because to the hardware, it's all a
single VLAN, just with a stick between one user port and the other
pretending to be a fence.
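
To make the leak concrete, here is a minimal sketch in C of the lookup
described above. It is a toy model, not code from any driver: the
struct, the function name cpu_xmit_dest_mask() and the port numbering
are all made up for illustration, and real hardware may resolve FDB
hits and misses differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BIT(n) (1u << (n))

/* Hypothetical, simplified FDB entry keyed by (VID, MAC). */
struct fdb_entry {
	uint16_t vid;
	uint8_t mac[6];
	unsigned int port;
};

/*
 * Model of what the switch does with a packet injected by the CPU:
 * tag_mask is the port mask from the tag, vlan_members the members of
 * the VLAN the packet was classified to. An FDB hit narrows the
 * candidates to the learned port, a miss floods to all candidates.
 */
static uint32_t cpu_xmit_dest_mask(const struct fdb_entry *fdb, size_t n,
				   uint16_t vid, uint32_t vlan_members,
				   const uint8_t dmac[6], uint32_t tag_mask)
{
	uint32_t candidates = tag_mask & vlan_members;
	size_t i;

	for (i = 0; i < n; i++) {
		if (fdb[i].vid == vid && memcmp(fdb[i].mac, dmac, 6) == 0)
			return candidates & BIT(fdb[i].port);
	}

	return candidates;
}
```

Assume ports 0-1 are under br0 and ports 2-3 under br1, and a station
was learned on port 2. If both bridges share VID 1 with all four ports
as members, a packet sent by br0 to that station's MAC with a tag mask
of 0xf hits the FDB entry and egresses port 2, i.e. it leaks into br1.
If br0 instead classifies to its own VID whose members are only ports
0-1, the lookup misses and the packet floods to ports 0-1 only.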

> However based on what I see in tag_dsa.c for the transmit case with
> skb->offload_fwd_mark, I would have to dig into the bridge's members
> in order to construct a bitmask of ports to provide to tag_brcm.c, so
> that does not really get me anywhere, does it?

And even then it wouldn't be correct. Not just the flooded packets come
with skb->offload_fwd_mark = true, all packets coming from the bridge
do, even the unicast ones. So the destination port mask needs to be a
subset of the ports under dp->bridge_dev.

But forget about using the destination port mask as a bitmap with more
than one bit set, and figuring out for each packet how to set it. This
mechanism wasn't created for that use case; supporting it would require so much
rework in the network stack that it isn't even funny.

Simple question: what do Broadcom switches do if you throw a packet at
them which has no DSA tag? Forward it as usual, like sja1105, or flat
out drop, like Marvell?

> Those switches also always do double VLAN tag (802.1ad) normalization within
> their data path whenever VLAN is globally enabled within the switch,

I don't know what double VLAN tag normalization is, sorry. Is it
something like "if an outer tag TPID is present, classify the packet to
the outer VID, else if an inner tag TPID is present, classify to the
inner VID, else to the port-based default"? I'm not sure that is helpful
either in general or in this particular case.
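
Spelled out as code, the guess above would look roughly like the sketch
below. This only restates the guess; it is not a description of what
Broadcom hardware actually does, and struct parsed_tags and
classify_vid() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result of parsing the VLAN headers of a frame. */
struct parsed_tags {
	bool has_outer;		/* outer tag TPID present */
	uint16_t outer_vid;
	bool has_inner;		/* inner tag TPID present */
	uint16_t inner_vid;
};

/* Outer VID wins, then inner VID, then the port-based default. */
static uint16_t classify_vid(const struct parsed_tags *t, uint16_t port_pvid)
{
	if (t->has_outer)
		return t->outer_vid;
	if (t->has_inner)
		return t->inner_vid;
	return port_pvid;
}
```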

> so in principle you could achieve the same isolation by reserving one of
> the outer VLAN tags per bridge, enabling IVL and hope that the FDB is
> looked

Hope? Well, is it or is it not? It's a bit of a pointless exercise if it isn't.

> including the outer and inner VLAN tags and not just the inner VLAN
> tag.

So I expect that if you encapsulate packets from the host in an outer
VLAN tag, the FDB will be looked up in that outer VLAN. That is exactly
what is needed in the case of VLAN-unaware bridging with proper FDB
isolation. The outer VLAN should have a value equal to the private pvid
configured on the ingress of the user ports that are under the
VLAN-unaware bridge, and all should in fact be well.
But in the case of VLAN-aware bridging, you want the switch to look up
the FDB in the same VLAN ID that the user port would classify it in,
were it to receive that same packet on ingress. So encapsulating it
wouldn't do it any good.

> If we don't have a FID concept, and not all switches do, how can we still
> support tx forwarding offload here?

Yes, sja1105 does not have a FID concept indeed. And it barely even has
a DSA tag.

If you don't have the concept of a FID, one thing I can tell you from
the get-go is that multiple VLAN-aware bridges will be broken, because
they don't have proper isolation at the level of FDB lookups among them.
So you should simply deny that configuration and operate with a single
VLAN-aware bridge, or multiple VLAN-unaware ones.
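
A sketch of that restriction, under the assumption that the driver keeps
a small table of the bridges it offloads. struct bridge_state and
check_vlan_aware_bridge() are hypothetical and do not exist in DSA; the
sketch only rejects a second VLAN-aware bridge and takes no position on
mixing aware and unaware bridges.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-bridge bookkeeping kept by the driver. */
struct bridge_state {
	bool in_use;
	bool vlan_aware;
};

/*
 * Called when bridge 'self' is about to become (or stay) VLAN-aware or
 * VLAN-unaware. Returns 0 if the configuration can be offloaded, or
 * -EOPNOTSUPP if it would result in two VLAN-aware bridges.
 */
static int check_vlan_aware_bridge(const struct bridge_state *bridges,
				   size_t n, size_t self, bool vlan_aware)
{
	size_t i;

	if (!vlan_aware)
		return 0;

	for (i = 0; i < n; i++) {
		if (i != self && bridges[i].in_use && bridges[i].vlan_aware)
			return -EOPNOTSUPP;
	}

	return 0;
}
```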

To isolate VLAN-unaware bridges between each other you can just crop
some VLANs from the 4K VID space (as many as the number of bridges you
want to support) and use them as private pvid on all ports, as well as
the encapsulating outer VLAN ID on xmit.
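
As an illustration only, cropping the VIDs from the top of the space and
building the encapsulating tag could look like the sketch below. The
limit of 8 bridges, the 0-based bridge index, the return value 0 as "no
private VID" and both function names are assumptions made for the
example. The TPID is left as a parameter because which one the switch
expects on the outer tag is not established here.

```c
#include <stdint.h>

#define VID_MAX_USABLE	4094u	/* VID 4095 is reserved by 802.1Q */
#define MAX_BRIDGES	8u	/* hypothetical limit for this sketch */

/*
 * Map a 0-based bridge index to a private VID taken from the top of
 * the VID space. Returns 0 when the index is out of range.
 */
static uint16_t private_vid(unsigned int bridge_idx)
{
	if (bridge_idx >= MAX_BRIDGES)
		return 0;

	return (uint16_t)(VID_MAX_USABLE - bridge_idx);
}

/* Write a 4-byte VLAN tag (TPID + TCI, network byte order), PCP/DEI = 0. */
static void build_outer_tag(uint8_t tag[4], uint16_t tpid, uint16_t vid)
{
	uint16_t tci = vid & 0x0fff;

	tag[0] = (uint8_t)(tpid >> 8);
	tag[1] = (uint8_t)(tpid & 0xff);
	tag[2] = (uint8_t)(tci >> 8);
	tag[3] = (uint8_t)(tci & 0xff);
}
```

The same value would be programmed as pvid on the user ports of the
bridge and used as the outer VID on xmit, so that learning and lookup
happen in the same VLAN.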

But anyway, here's something I don't understand: is there any field in
the Broadcom xmit DSA tag where you tell the switch in which VLAN the
packet should be processed? If there isn't, and the only mechanism by which
the switch classifies a packet to a VLAN on the IMP port is simply by
looking at the 802.1Q header; and yet it only looks at the 802.1Q header
if VLAN awareness is turned on, then bad luck, because VLAN awareness is
a global setting. So you turn it on for the IMP port => you turn it on
for all user ports as well => bye bye FDB isolation between VLAN-unaware
bridges, because bye-bye VLAN-unaware bridges.

So I just hope there is a way to inject a packet into a given VLAN
through your IMP port that does not involve turning VLAN awareness on
globally. If that is not the case, well, I don't know. Can we have a
more complete picture of the Broadcom tag other than what tag_brcm.c
sets today?