netdev - Re: DSA: some questions regarding TX forwarding offload

Message-ID: <20211005152557.wbn7ojk5nphos5s5@skbuf>
Date:   Tue, 5 Oct 2021 15:25:58 +0000
From:   Vladimir Oltean <vladimir.oltean@....com>
To:     Alvin Šipraga <ALSI@...g-olufsen.dk>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Florian Fainelli <f.fainelli@...il.com>,
        Andrew Lunn <andrew@...n.ch>
Subject: Re: DSA: some questions regarding TX forwarding offload

On Tue, Oct 05, 2021 at 02:38:28PM +0000, Alvin Šipraga wrote:
> On 10/5/21 3:29 PM, Vladimir Oltean wrote:
> > On Tue, Oct 05, 2021 at 12:06:38PM +0000, Alvin Šipraga wrote:
> >> The FDB isolation mechanism in my switch seems to be pretty good. As
> >> long as I can pass along *some* information from the switch driver to
> >> the tagging driver - namely the "allowance port mask" for a given bridge
> >> - I think I should be able to achieve full isolation between up to 7
> >> VLAN-aware bridges and with no restrictions on the number of VLANs per
> >> bridge, nor on the sharing of VLANs per bridge.
> >>
> >> Here is a quick summary of the relevant behaviour of the switch:
> >>
> >> VLANs programmed on the switch can be set to either SVL or IVL, on a
> >> per-VLAN basis. This affects how learned MAC addresses are searched
> >> for/saved in the hardware FDB:
> >>
> >>     - In SVL mode, the hardware FDB is keyed with {FID, MAC}.
> >>     - In IVL mode, the hardware FDB is keyed with {VID, MAC, EFID}.
> >>
> >> EFID stands for "Enhanced Filtering Identifier". The EFID is 3 bits.
> >>
> >> Unlike the FID - which is programmed per-VLAN - the EFID is programmed
> >> per-port. When a port has learning enabled and it receives an ingress
> >> frame with a given VID and MAC SA, it will search in the hardware FDB
> >> with a key {VID, MAC SA, EFID} - where EFID is the port EFID - and if
> >> the entry is not found, it will create a new one. This allows the switch
> >> to learn the same {VID, MAC SA} pair on two separate ports, provided
> >> those ports have different EFIDs.
> >
> > So you are basically saying that for FDB lookups, the EFID is like a
> > 3-bit extension of the 12-bit VID, practically emulating a 32K VLAN space?
>
> On a high level yes, you can look at it that way. In practice I think
> the picture is a bit more nuanced, and that enabling IVL on a particular
> VLAN just means that the ASIC will do the VLAN<->FID mapping
> automatically. However I was too eager to state "no restrictions on the
> number of VLANs [...]" because in fact the FIDs of this chip are only
> 4-bit. So if I use IVL, the limit will be 16 VLANs. However, I think the
> EFID strategy for FDB isolation still holds for SVL. Most of my
> understanding of the IVL/SVL feature is empirical, based on pushing
> packets and observing how the switch forwards them, and dumping the
> hardware FDB.

So let me rephrase the facts which you've presented to make sure I get this right.

(a) The switch classifies each frame to an internal 4-bit FID.

(b) Each VLAN (not {port, VLAN} pair) can be configured for SVL or IVL.
    When a packet is received, it is first classified to a VLAN, then
    the VLAN table is looked up, and the switch determines whether that
    VLAN is configured for SVL or IVL.

(c) If configured for SVL, the 4-bit internal FID is derived exclusively
    from the VLAN table entry.

(d) If configured for IVL, the ingress port's EFID is read, and the
    4-bit internal FID is derived from the {12-bit VID, 3-bit port EFID}
    squashed into a 4-bit number.

(e) The number of internal FIDs in use cannot exceed 16, regardless of
    whether SVL or IVL is used for a VID. In other words, the FDB cannot
    be partitioned into more than 16 groups.

(f) The FDB is always looked up by {internal FID, MAC}.
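
To make points (a) through (f) concrete, here is a rough model of the
classification as I understand it. This is only a sketch: the struct
and function names are hypothetical, the table layout is not the real
register layout, and squash_ivl() is a placeholder, since how the
hardware reduces {VID, EFID} to 4 bits is exactly what is unknown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_VIDS  4096  /* 12-bit VID space */
#define FID_MASK  0xf   /* 4-bit internal FID, per (a) */
#define EFID_MASK 0x7   /* 3-bit per-port EFID */

/* Hypothetical VLAN table entry, not the real hardware layout. */
struct vlan_entry {
	bool ivl;     /* (b): SVL/IVL is selected per VLAN */
	uint8_t fid;  /* (c): FID programmed in the VLAN table, used for SVL */
};

/* Placeholder for the unknown hardware reduction in (d). */
static uint8_t squash_ivl(uint16_t vid, uint8_t efid)
{
	return (uint8_t)((vid ^ efid) & FID_MASK);
}

/* Derive the internal FID that keys the FDB lookup in (f). */
static uint8_t internal_fid(const struct vlan_entry *vlan_table,
			    uint16_t vid, uint8_t port_efid)
{
	const struct vlan_entry *v;

	assert(vid < NUM_VIDS);
	v = &vlan_table[vid];

	if (v->ivl)
		return squash_ivl(vid, port_efid & EFID_MASK);

	return v->fid & FID_MASK;
}
```

Whatever the real reduction is, the result is at most 16 distinct
values, which is what point (e) claims.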

How do you know that point (e) is true? If you add more than 16 VLANs
using IVL, is there any error? If the user can map a SVL VID to a FID
directly through the VLAN table, does that mean that the hardware
continuously remaps IVL {VID, EFID} VLANs to different FIDs, as FID
values keep getting used up by SVL? Can you make an IVL VID reuse the
same internal FID as an SVL VID? Can you make two IVL VIDs use the same
internal FID?

Anyway, this complicates things quite a bit. The Linux bridge doesn't
really have an SVL/IVL knob. It assumes IVL. Things will get challenging
when you offload FDB entries with a given {VID, MAC DA}: what do you do
if you access the FDB by FID, but there isn't in fact a bijective mapping
between the VID and the FID? Do you keep reference counts per FDB entry,
so that when the user deletes a MAC DA from VID A while that MAC DA is
also present in VID B, and both VIDs map to the same FID, you still keep
the entry? And most importantly, do you see the FID bits
in the tagger in the receive path as well? Can you dump them for packets
classified to a FID in different ways, using IVL, SVL?
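
For the reference counting question above, this is roughly what I have
in mind. It is a minimal sketch with made-up names and a fixed-size
array; a real driver would hook this into its FDB add/del operations
and its own data structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_FDB  64  /* arbitrary size for the sketch */
#define ETH_ALEN 6

/* One hardware FDB entry keyed by {FID, MAC}; names are hypothetical. */
struct fdb_ref {
	uint8_t fid;
	uint8_t mac[ETH_ALEN];
	unsigned int refs;  /* number of {VID, MAC} bridge entries mapping here */
};

struct fdb_table {
	struct fdb_ref e[MAX_FDB];
};

static struct fdb_ref *fdb_find(struct fdb_table *t, uint8_t fid,
				const uint8_t *mac)
{
	for (int i = 0; i < MAX_FDB; i++) {
		struct fdb_ref *r = &t->e[i];

		if (r->refs && r->fid == fid && !memcmp(r->mac, mac, ETH_ALEN))
			return r;
	}
	return NULL;
}

/* Returns 1 if the hardware entry must be written, 0 if it already
 * exists and only the count was bumped, -1 if the table is full.
 */
static int fdb_add(struct fdb_table *t, uint8_t fid, const uint8_t *mac)
{
	struct fdb_ref *r;

	assert(t && mac);
	r = fdb_find(t, fid, mac);
	if (r) {
		r->refs++;
		return 0;
	}
	for (int i = 0; i < MAX_FDB; i++) {
		r = &t->e[i];
		if (!r->refs) {
			r->fid = fid;
			memcpy(r->mac, mac, ETH_ALEN);
			r->refs = 1;
			return 1;
		}
	}
	return -1;
}

/* Returns 1 if the hardware entry must be removed, 0 if another VID
 * still references it, -1 if there is no such entry.
 */
static int fdb_del(struct fdb_table *t, uint8_t fid, const uint8_t *mac)
{
	struct fdb_ref *r;

	assert(t && mac);
	r = fdb_find(t, fid, mac);
	if (!r)
		return -1;
	r->refs--;
	return r->refs ? 0 : 1;
}
```

With VID A and VID B both mapping to the same FID, deleting the MAC DA
from VID A returns 0 and leaves the hardware entry in place; only the
second deletion returns 1.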

> It could be that my conclusions about "lookup by VID" as opposed to
> "lookup by FID" are wrong, but if it comes to that, I will just have to
> manually implement VID<->FID mapping in the driver.

And this is the second complication. Whatever VID<->FID mapping you make,
if it's not static, you'll need a lookup table in the tagging protocol
driver to translate the VID from the skb to a FID. Odd. Or maybe I'm wrong.
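
Something along these lines, as a sketch only. The names are
hypothetical and nothing like this exists in DSA today; in a real
driver the table would be shared between the switch driver and the
tagger, so it would also need some form of synchronization, which is
left out here.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_VIDS    4096  /* 12-bit VID space */
#define FID_MASK    0xf   /* 4-bit FID */
#define FID_INVALID 0xff  /* marks a VID with no FID assigned */

/* Hypothetical table filled in by the switch driver, read by the tagger. */
struct vid_fid_map {
	uint8_t fid[NUM_VIDS];
};

static void vid_fid_map_init(struct vid_fid_map *m)
{
	for (int i = 0; i < NUM_VIDS; i++)
		m->fid[i] = FID_INVALID;
}

/* Called by the switch driver whenever it (re)assigns a FID to a VID. */
static void vid_fid_map_set(struct vid_fid_map *m, uint16_t vid, uint8_t fid)
{
	assert(vid < NUM_VIDS);
	m->fid[vid] = fid & FID_MASK;
}

/* Called on xmit with the VID taken from the skb.
 * Returns 0 and stores the FID on success, -1 if the VID is unmapped.
 */
static int vid_to_fid(const struct vid_fid_map *m, uint16_t vid, uint8_t *fid)
{
	assert(vid < NUM_VIDS);
	if (m->fid[vid] == FID_INVALID)
		return -1;
	*fid = m->fid[vid];
	return 0;
}
```

If the mapping were static, none of this would be needed and the
tagger could compute the FID directly.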

> > Practically are you saying that the switch loses the EFID information
> > between the ingress and the egress stage, since the destination port
> > mask is selected based on a key constructed with "don't care" in the EFID bits?
> > Strange.
>
> Strange indeed - and wrong! I just checked this again. The switch
> actually _does_ preserve the EFID for the second lookup when selecting
> the destination port mask, and this behaves as you would expect. My
> observation to the contrary was specifically for the case where there is
> no hit for the destination address, in which case the switch will
> _flood_ according to the VLAN and MAC DA, without regard for the EFID.
> This kind of makes sense, since the EFID is just a searching/learning
> look-up-table concept and is not related to flooding. OTOH there are
> flooding port mask registers where one can set for
> {uni,multi,broad}cast, but this configuration is independent of VLAN.

So flooding is indeed the miss action from the FDB, but I'm just
wondering, aren't the flood control registers replicated per FID in fact?