netdev - Re: DSA: some questions regarding TX forwarding offload

Message-ID: <20211007094707.wg24vgbf57cr76mi@skbuf>
Date:   Thu, 7 Oct 2021 09:47:07 +0000
From:   Vladimir Oltean <vladimir.oltean@....com>
To:     Alvin Šipraga <ALSI@...g-olufsen.dk>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Florian Fainelli <f.fainelli@...il.com>,
        Andrew Lunn <andrew@...n.ch>
Subject: Re: DSA: some questions regarding TX forwarding offload

On Wed, Oct 06, 2021 at 04:16:34PM +0000, Alvin Šipraga wrote:
> First, allow me to reproduce the relevant part of the datasheet here:
>
> | == Search and Learning
> |
> | = Search
> |
> | When a packet is received, the RTL8365MB-VC uses the destination MAC
> | address, Filtering Identifier (FID) and Enhanced Filtering Identifier
> | (EFID) to search the 2K-entry look-up table. The 48-bit MAC address,
> | 4-bit FID and 3-bit EFID use a hash algorithm, to calculate an 11-bit
> | index value. The RTL8365MB-VC uses the index to compare the packet MAC
> | address with the entries (MAC addresses) in the look-up table. This is
> | the ‘Address Search’. If the destination MAC address is not found, the
> | switch will broadcast the packet according to VLAN configuration.

Unless the wording is plain incorrect, or it describes only the "default
operation" and not every case, I think this says that the LUT is always
indexed based on a hash of {48-bit MAC, 4-bit FID, 3-bit EFID}.
No mention of VID.

> |
> | = Learning
> |
> | The RTL8365MB-VC uses the source MAC address, FID, and EFID of the
> | incoming packet to hash into a 9-bit index. It then compares the source
> | MAC address with the data (MAC addresses) in this index. If there is a
> | match with one of the entries, the RTL8365MB-VC will update the entry
> | with new information. If there is no match and the 2K entries are not
> | all occupied by other MAC addresses, the RTL8365MB-VC will record the
> | source MAC address and ingress port number into an empty entry. This
> | process is called ‘Learning’.
> | Address aging is used to keep the contents of the address table correct
> | in a dynamic network topology. The look-up engine will update the time
> | stamp information of an entry whenever the corresponding source MAC
> | address appears. An entry will be invalid (aged out) if its time stamp
> | information is not refreshed by the address learning process during the
> | aging time period. The aging time of the RTL8365MB-VC is between 200 and
> | 400 seconds (typical is 300 seconds).
> |
> | == SVL and IVL/SVL
> |
> | The RTL8365MB-VC supports a 16-group Filtering Identifier (FID) for L2
> | search and learning. In default operation, all VLAN entries belong to
> | the same FID. This is called Shared VLAN Learning (SVL). If VLAN entries
> | are configured to different FIDs, then the same source MAC address with
> | multiple FIDs can be learned into different look-up table entries. This
> | is called Independent VLAN Learning and Shared VLAN Learning (IVL/SVL).

I think I understand what they're trying to say, although I don't
understand what "default operation" means. Typical usage? No idea.

> This "IVL/SVL" mode would appear to correspond to a field in the vendor
> driver sources called ivl_svl (ivl_svl=1 is what I have referred to as
> "IVL" all this time), which is part of each VLAN configuration in the
> VLAN table. But that field also comes with a /* IVL */ or /* IVL_EN */
> comment next to it in some places. So I am unsure whether there is a
> third, "genuine" IVL mode which does not use the FID at all. At least,
> the description in the datasheet doesn't seem to correlate with the
> behaviour of this ivl_svl switch. But I could be parsing it wrong.

So you've said that a VLAN table entry contains an IVL_EN bit, and a FID.
It's this structure, right?

struct rtl8365mb_vlan_4k {
	u16 vid;
	u16 member;
	u16 untag;
	u8 fid;
	u8 priority;
	u8 priority_en : 1;
	u8 policing_en : 1;
	u8 ivl_en : 1;
	u8 meteridx;
};

What they say is: if you configure some of the VLAN table entries with
non-identical FIDs, you are operating in mixed IVL/SVL mode. Meaning:
you still haven't set the IVL bit in any of the VLAN table entries,
therefore you are still using SVL, where the VLAN table maps a VID to a
FID (and this is in line with the explanation given above).
But on the other hand, not all VLANs map to the same FID (as in the pure
SVL case). So it is a mixed SVL/IVL mode.
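
To spell out how I read the datasheet's terminology, here is a rough
user-space sketch. The struct is a trimmed-down, hypothetical stand-in for
the VLAN table entry, and the enum names are made up for illustration; none
of this comes from the vendor driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down VLAN table entry: only the fields discussed. */
struct vlan_entry {
	uint16_t vid;
	uint8_t fid;
	bool ivl_en;
};

enum learn_mode {
	LEARN_SVL,	/* no IVL bit set, all VLANs map to one FID */
	LEARN_IVL_SVL,	/* no IVL bit set, VLANs map to several FIDs */
	LEARN_HAS_IVL,	/* at least one entry has the IVL bit set */
};

/* Classify a set of VLAN entries according to the reading above. */
static enum learn_mode classify(const struct vlan_entry *v, size_t n)
{
	enum learn_mode mode = LEARN_SVL;
	size_t i;

	assert(v != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (v[i].ivl_en)
			return LEARN_HAS_IVL;
		if (v[i].fid != v[0].fid)
			mode = LEARN_IVL_SVL;
	}

	return mode;
}
```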

What I suspect is that if you set the IVL bit in the VLAN table entry,
the FID is completely ignored. Or maybe, with IVL, the VID _is_ the FID,
and in that case, the description above would actually be correct in
stating that the LUT is always looked up by {MAC, EFID, FID}.
What absolutely bugs me is the fact that they say the FID is 4-bit.
When using a 4K VLAN ID as FID, you can't use just 4 bits of it...

> Now, rather than speculate further on the semantics, I went ahead and
> tested out the behaviour by:
>
> - adding 32 VLANs 100..131 on a port, all with IVL (i.e. ivl_svl=1)
> - cycling through the 8 possible port EFIDs (0..7) on that port
> - for each EFID, sending one 802.1Q-tagged frame with VID=n for
> n=100..131 to the port from the network
>
> Some notes:
>
> - the chip supports up to 32 concurrent VLANs (globally); this is a
> general limitation of the hardware.
> - in this scenario the MAC SA is the same for each frame I transmit from
> the network into the port.
>
> By dumping the hardware FDB after the fact, I can see 32 * 8 FDB entries
> for the given MAC SA of my frames:
>
> 	cat /sys/kernel/debug/rtl8365mb/lut_dump
> 	addr    mac                     vid_fid spa     fid     efid
> 	0004    00:00:aa:aa:aa:aa       104     2       0       0
> 	0005    00:00:aa:aa:aa:aa       112     2       0       3
> 	0006    00:00:aa:aa:aa:aa       120     2       0       2
> 	0008    00:00:aa:aa:aa:aa       128     2       0       5
> 	0036    00:00:aa:aa:aa:aa       105     2       0       0
> 	0037    00:00:aa:aa:aa:aa       113     2       0       3
> 	0038    00:00:aa:aa:aa:aa       121     2       0       2
> 	0040    00:00:aa:aa:aa:aa       129     2       0       5
> 	0068    00:00:aa:aa:aa:aa       106     2       0       0
> 	... (table continues with an entry for each VID/EFID combo)
>
> Legend:
> 	addr: look-up-table index
> 	mac: MAC address
> 	vid_fid: VID of the frame for both IVL and SVL

Who gave it this "vid_fid" name?

> 	spa: source port address, i.e. the port that learned
> 	fid: FID (of the VLAN)
> 	efid: EFID (of the port)
>
> I also tried sending untagged frames from the network and cycling
> through one of the VLANs as PVID, in which case the port would learn and
> make an entry with vid_fid corresponding to the PVID.
>
> This suggests to me that the IVL field of the VLAN configuration really
> does achieve Independent VLAN learning, and that there are not many
> constraints here besides the size of the look-up-table.

Can you repeat the experiment sweeping through EFIDs, but with the VLANs
configured for SVL and having the same FID? I would expect the LUT
indices to be different, but to still get as many entries. Just want to
confirm my theory that the EFID provides port-based isolation regardless
of IVL_EN.

Also, can you please repeat the IVL experiment, but with non-consecutive
VIDs: N, N+16, N+32, N+48, ... N+2048 etc?
I would like to get to the bottom of that 4-bit FID thing.
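
The reasoning behind the stride of 16: such VIDs all share the same low 4
bits, so if the hardware only used 4 bits of the VID as a FID, I would
expect them to land on a single LUT entry instead of one entry each. A
small sketch for generating such a list follows. The helper names are made
up, and the 32-entry cap in the usage note simply reflects the concurrent
VLAN limit you mentioned, so reaching N+2048 would take several batches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FID_BITS	4
#define FID_MASK	((1u << FID_BITS) - 1)
#define VID_MAX		4094

/*
 * Fill out[] with base, base + 16, base + 32, ... stopping at VID_MAX or
 * after max entries, whichever comes first. Returns the number written.
 */
static size_t stride16_vids(uint16_t base, uint16_t *out, size_t max)
{
	unsigned int vid;
	size_t n = 0;

	assert(base >= 1 && base <= VID_MAX);

	for (vid = base; vid <= VID_MAX && n < max; vid += 1u << FID_BITS)
		out[n++] = (uint16_t)vid;

	return n;
}

/* The part of the VID that would survive a truncation to a 4-bit FID. */
static unsigned int low_fid_bits(uint16_t vid)
{
	return vid & FID_MASK;
}
```

Calling stride16_vids(100, vids, 32) gives 100, 116, 132 and so on, and
low_fid_bits() is the same for every one of them.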

> Could it be that the ivl_svl switch simply controls how this
> look-up-table index is computed? That is to say:
>
> SVL: {FID, EFID, MAC} -> index
> IVL: {VID, EFID, MAC} -> index
>
> I tried the following scenario:
>
> 	# add VLAN 100/101
> 	bridge vlan add vid 100 dev swp2
> 	bridge vlan add vid 101 dev swp2
>
> 	# send VID 100 frame from another host on the network
> 	mausezahn eth2 -Q 100 -c 1 -a '00:00:aa:aa:aa:aa' -t tcp
>
> 	# dump HW FDB
> 	cat /sys/kernel/debug/rtl8365mb/lut_dump
>
> 	# send VID 101 frame this time
> 	mausezahn eth2 -Q 101 -c 1 -a '00:00:aa:aa:aa:aa' -t tcp
>
> 	# dump HW FDB
> 	cat /sys/kernel/debug/rtl8365mb/lut_dump
>
> I then tested this out with:
>
> - IVL, FID=0
>
> 	##### send frame on VLAN 100
> 	cat /sys/kernel/debug/rtl8365mb/lut_dump
> 	addr    mac                     vid_fid spa     fid     efid
> 	0388    00:00:aa:aa:aa:aa       100     2       0       0
> 	##### send frame on VLAN 101
> 	cat /sys/kernel/debug/rtl8365mb/lut_dump
> 	addr    mac                     vid_fid spa     fid     efid
> 	0388    00:00:aa:aa:aa:aa       100     2       0       0
> 	0420    00:00:aa:aa:aa:aa       101     2       0       0
>
> - IVL, FID=9
>
> 	##### send frame on VLAN 100
> 	cat /sys/kernel/debug/rtl8365mb/lut_dump
> 	addr    mac                     vid_fid spa     fid     efid
> 	0388    00:00:aa:aa:aa:aa       100     2       9       0
> 	##### send frame on VLAN 101
> 	cat /sys/kernel/debug/rtl8365mb/lut_dump
> 	addr    mac                     vid_fid spa     fid     efid
> 	0388    00:00:aa:aa:aa:aa       100     2       9       0
> 	0420    00:00:aa:aa:aa:aa       101     2       9       0
>
> - SVL, FID=0
>
> 	##### send frame on VLAN 100
> 	cat /sys/kernel/debug/rtl8365mb/lut_dump
> 	addr    mac                     vid_fid spa     fid     efid
> 	1280    00:00:aa:aa:aa:aa       100     2       0       0
> 	##### send frame on VLAN 101
> 	cat /sys/kernel/debug/rtl8365mb/lut_dump
> 	addr    mac                     vid_fid spa     fid     efid
> 	1280    00:00:aa:aa:aa:aa       101     2       0       0
>
> - SVL, FID=9
>
> 	##### send frame on VLAN 100
> 	cat /sys/kernel/debug/rtl8365mb/lut_dump
> 	addr    mac                     vid_fid spa     fid     efid
> 	1056    00:00:aa:aa:aa:aa       100     2       9       0
> 	##### send frame on VLAN 101
> 	cat /sys/kernel/debug/rtl8365mb/lut_dump
> 	addr    mac                     vid_fid spa     fid     efid
> 	1056    00:00:aa:aa:aa:aa       101     2       9       0
>
> Some observations:
>
> - with IVL, index is the same for FID=0,9
> - with SVL, index is different for FID=0,9
> - with IVL, index is different for VID=100,101
> - with SVL, index is the same for VID=100,101

Yes, good job investigating, this seems to support the theory that when
a VLAN table entry is configured for IVL, the FID (actually vid_fid in
your dumps) is the VID, otherwise it's the FID from the VLAN table entry.
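
In code, the theory amounts to something like the sketch below. It only
models which fields would feed the hash, not the hash itself (which we
don't know), and all the names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down VLAN table entry. */
struct vlan_entry {
	uint16_t vid;
	uint8_t fid;
	bool ivl_en;
};

/* The fields assumed to feed the LUT index hash. */
struct lut_key {
	uint8_t mac[6];
	uint16_t vid_or_fid;
	uint8_t efid;
};

/* IVL: key on the VID. SVL: key on the FID from the VLAN table entry. */
static struct lut_key lut_key_build(const uint8_t mac[6],
				    const struct vlan_entry *vlan,
				    uint8_t efid)
{
	struct lut_key key;

	assert(efid < 8);	/* EFID is 3 bits wide */

	memcpy(key.mac, mac, sizeof(key.mac));
	key.vid_or_fid = vlan->ivl_en ? vlan->vid : vlan->fid;
	key.efid = efid;

	return key;
}

/* Equal keys would hash to the same LUT index. */
static bool lut_key_equal(const struct lut_key *a, const struct lut_key *b)
{
	return memcmp(a->mac, b->mac, sizeof(a->mac)) == 0 &&
	       a->vid_or_fid == b->vid_or_fid &&
	       a->efid == b->efid;
}
```

This reproduces your four observations: with IVL the key ignores the FID
and changes with the VID, and with SVL it is the other way around.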

> In particular, with IVL, the FID is stored in the table but it does not
> seem to affect the index.

It's probably there so that you don't need to flush the LUT and
reinstall everything when you change a VLAN table entry from IVL to SVL/IVL.

> I _think_ I can look up the FDB by VID, not just FID - I still have to
> confirm that but I think it depends on whether the particular VLAN is in
> IVL or SVL mode.
>
> But either way, there are bound to be collisions given the way the
> look-up-table works. If the driver is asked to offload two FDB entries
> which map to the same look-up-table entry (i.e. same index), can't it
> just error out on the second request? Something like "I see this entry
> is already occupied by a static (offloaded) FDB entry, so I can't
> satisfy this request".

Yes, in case of hash collisions between unrelated entries on a full row,
returning -ENOSPC is clearly okay. This case is more interesting because
the LUT entries are not unrelated. I was commenting under the assumption
that you will need to give switchdev the impression that you are
offloading entries via IVL (so you should accept two FDB entries for the
same MAC DA in different VIDs, as long as they point towards the same
destination port) because that's how the hardware is going to treat them.
The only problematic case is when switchdev asks for one FDB entry in one
VLAN to go one way, and another in another VLAN to go another way.
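
As a sketch of that check, assuming the hardware keys its entries on
{MAC, FID} as in the SVL case (the struct and function names here are
hypothetical, and fid stands for the FID that the VLAN table maps the
request's VID to):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical FDB request as seen by the driver. */
struct fdb_req {
	uint8_t mac[6];
	uint16_t vid;
	uint8_t fid;	/* FID that the VLAN table maps vid to */
	int port;	/* destination port */
};

/*
 * Two requests that share {MAC, FID} collapse into a single hardware
 * entry, so they can only coexist if they point to the same port.
 */
static bool fdb_reqs_can_coexist(const struct fdb_req *a,
				 const struct fdb_req *b)
{
	assert(a != NULL && b != NULL);

	if (memcmp(a->mac, b->mac, sizeof(a->mac)) != 0)
		return true;
	if (a->fid != b->fid)
		return true;

	return a->port == b->port;
}
```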

[ by the way you can't propagate errors from .port_fdb_add to switchdev,
  and to the bridge, sorry ]

Anyway, it doesn't matter. It's clearer now that you don't have to care
about this: I don't think you should use the SVL or SVL/IVL modes for
anything. Just program all VLAN table entries with IVL=true, and set the
EFID based on dp->bridge_num.

> > And most importantly, do you see the FID bits in the tagger in the
> > receive path as well?
>
> No, I don't, which is kind of strange. But is it a problem?

Not really, no.