Message-ID: <cd717680-dbac-4329-75af-32d0c677d622@bang-olufsen.dk>
Date: Wed, 6 Oct 2021 16:16:34 +0000
From: Alvin Šipraga <ALSI@...g-olufsen.dk>
To: Vladimir Oltean <vladimir.oltean@....com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Florian Fainelli <f.fainelli@...il.com>,
Andrew Lunn <andrew@...n.ch>
Subject: Re: DSA: some questions regarding TX forwarding offload
On 10/5/21 5:25 PM, Vladimir Oltean wrote:
>
> So let me rephrase the facts which you've presented to make sure I get this right.
>
> (a) The switch processes each frame in an internal 4-bit FID.
>
> (b) Each VLAN (not {port, VLAN} pair) can be configured for SVL or IVL.
> When a packet is received, it is first classified to a VLAN, then
> the VLAN table is looked up, and the switch determines whether that
> VLAN is configured for SVL or IVL.
>
> (c) If configured for SVL, the 4-bit internal FID is derived exclusively
> from the VLAN table entry.
>
> (d) If configured for IVL, the ingress port's EFID is read, and the
> 4-bit internal FID is derived from the {12-bit VID, 3-bit port EFID}
> squashed into a 4-bit number.
>
> (e) The sum of internal FIDs in use does not exceed 16, regardless of
> whether SVL or IVL is used for a VID. Otherwise said, the FDB cannot
> be partitioned in more than 16 groups.
>
> (f) The FDB is always looked up by {internal FID, MAC}.
>
Hi Vladimir,

The idea that the chip maps everything to an (internal) 4-bit FID - even
in IVL mode - was just conjecture based on what I read in the datasheet
of the chip. I think you can see that I'm still a bit confused by this
hardware. I'm sorry if you feel like you wasted your time, but hopefully
this mail clarifies some things for you.

First, allow me to reproduce the relevant part of the datasheet here:
| == Search and Learning
|
| = Search
|
| When a packet is received, the RTL8365MB-VC uses the destination MAC
| address, Filtering Identifier (FID) and Enhanced Filtering Identifier
| (EFID) to search the 2K-entry look-up table. The 48-bit MAC address,
| 4-bit FID and 3-bit EFID use a hash algorithm, to calculate an 11-bit
| index value. The RTL8365MB-VC uses the index to compare the packet MAC
| address with the entries (MAC addresses) in the look-up table. This is
| the ‘Address Search’. If the destination MAC address is not found, the
| switch will broadcast the packet according to VLAN configuration.
|
| = Learning
|
| The RTL8365MB-VC uses the source MAC address, FID, and EFID of the
| incoming packet to hash into a 9-bit index. It then compares the source
| MAC address with the data (MAC addresses) in this index. If there is a
| match with one of the entries, the RTL8365MB-VC will update the entry
| with new information. If there is no match and the 2K entries are not
| all occupied by other MAC addresses, the RTL8365MB-VC will record the
| source MAC address and ingress port number into an empty entry. This
| process is called ‘Learning’.
|
| Address aging is used to keep the contents of the address table correct
| in a dynamic network topology. The look-up engine will update the time
| stamp information of an entry whenever the corresponding source MAC
| address appears. An entry will be invalid (aged out) if its time stamp
| information is not refreshed by the address learning process during the
| aging time period. The aging time of the RTL8365MB-VC is between 200 and
| 400 seconds (typical is 300 seconds).
|
| == SVL and IVL/SVL
|
| The RTL8365MB-VC supports a 16-group Filtering Identifier (FID) for L2
| search and learning. In default operation, all VLAN entries belong to
| the same FID. This is called Shared VLAN Learning (SVL). If VLAN entries
| are configured to different FIDs, then the same source MAC address with
| multiple FIDs can be learned into different look-up table entries. This
| is called Independent VLAN Learning and Shared VLAN Learning (IVL/SVL).
This "IVL/SVL" mode would appear to correspond to a field in the vendor
driver sources called ivl_svl (ivl_svl=1 is what I have referred to as
"IVL" all this time), which is part of each VLAN configuration in the
VLAN table. But that field also comes with a /* IVL */ or /* IVL_EN */
comment next to it in some places. So I am unsure whether there is a
third, "genuine" IVL mode which does not use the FID at all. At least,
the description in the datasheet doesn't seem to correlate with the
behaviour of this ivl_svl switch. But I could be parsing it wrong.
Now, rather than speculate further on the semantics, I went ahead and
tested out the behaviour by:
- adding 32 VLANs 100..131 on a port, all with IVL (i.e. ivl_svl=1)
- cycling through the 8 possible port EFIDs (0..7) on that port
- for each EFID, sending one 802.1Q-tagged frame with VID=n for
n=100..131 to the port from the network
Some notes:
- the chip supports up to 32 concurrent VLANs (globally); this is a
general limitation of the hardware.
- in this scenario the MAC SA is the same for each frame I transmit from
the network into the port.
By dumping the hardware FDB after the fact, I can see 32 * 8 = 256 FDB
entries for the given MAC SA of my frames:
cat /sys/kernel/debug/rtl8365mb/lut_dump
addr mac vid_fid spa fid efid
0004 00:00:aa:aa:aa:aa 104 2 0 0
0005 00:00:aa:aa:aa:aa 112 2 0 3
0006 00:00:aa:aa:aa:aa 120 2 0 2
0008 00:00:aa:aa:aa:aa 128 2 0 5
0036 00:00:aa:aa:aa:aa 105 2 0 0
0037 00:00:aa:aa:aa:aa 113 2 0 3
0038 00:00:aa:aa:aa:aa 121 2 0 2
0040 00:00:aa:aa:aa:aa 129 2 0 5
0068 00:00:aa:aa:aa:aa 106 2 0 0
... (table continues with an entry for each VID/EFID combo)
Legend:
addr: look-up-table index
mac: MAC address
vid_fid: VID of the frame for both IVL and SVL
spa: source port address, i.e. the port that learned the entry
fid: FID (of the VLAN)
efid: EFID (of the port)
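To count the distinct VID/EFID combinations in such a dump rather than by eye, a minimal Python sketch could look like the following. It assumes the six-column layout shown above and that addr is printed in decimal (values such as 1280 only fit a 2K-entry table when read that way). The names LutEntry and parse_lut_dump are made up for illustration and are not part of any driver.

```python
from typing import List, NamedTuple


class LutEntry(NamedTuple):
    addr: int      # look-up-table index (assumed decimal)
    mac: str
    vid_fid: int
    spa: int       # source port
    fid: int
    efid: int


def parse_lut_dump(text: str) -> List[LutEntry]:
    """Parse the debugfs dump format shown in this mail (hypothetical helper)."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a six-column entry.
        if len(fields) != 6 or not fields[0].isdigit():
            continue
        addr, mac, vid_fid, spa, fid, efid = fields
        entries.append(
            LutEntry(int(addr), mac, int(vid_fid), int(spa), int(fid), int(efid))
        )
    return entries


# First four rows of the dump above, used as sample input.
SAMPLE = """\
addr mac vid_fid spa fid efid
0004 00:00:aa:aa:aa:aa 104 2 0 0
0005 00:00:aa:aa:aa:aa 112 2 0 3
0006 00:00:aa:aa:aa:aa 120 2 0 2
0008 00:00:aa:aa:aa:aa 128 2 0 5
"""

entries = parse_lut_dump(SAMPLE)
# Distinct {VID, EFID} combinations learned for the same MAC SA.
combos = {(e.vid_fid, e.efid) for e in entries}
print(len(entries), "entries,", len(combos), "distinct VID/EFID combos")
```

Run against the full dump, the number of distinct combinations would be expected to come out as 256 if every VID/EFID pair got its own entry.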
I also tried sending untagged frames from the network and cycling
through one of the VLANs as PVID, in which case the port would learn and
make an entry with vid_fid corresponding to the PVID.
This suggests to me that the IVL field of the VLAN configuration really
does achieve Independent VLAN learning, and that there are not many
constraints here besides the size of the look-up-table.
Now to address your questions...
> How do you know that point (e) is true?
Evidently it is not true, since I can partition the FDB into more than
16 groups.
> If you add more than 16 VLANs using IVL, is there any error?
I added 32 and things seem to work OK.
> If the user can map a SVL VID to a FID directly through the VLAN table,
> does that mean that the hardware continuously remaps IVL {VID, EFID}
> VLANs to different FIDs, as FID values keep getting used up by SVL?
This would be quite some gymnastics on the part of the ASIC. Let's take
a step back.
Could it be that the ivl_svl switch simply controls how this
look-up-table index is computed? That is to say:
SVL: {FID, EFID, MAC} -> index
IVL: {VID, EFID, MAC} -> index
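As a rough illustration of this conjecture only, the key selection could be modelled in Python as below. The function name lut_key is hypothetical, and the actual hash that turns the key into an index is not documented here, so the sketch stops at building the key.

```python
def lut_key(ivl: bool, vid: int, fid: int, efid: int, mac: str) -> tuple:
    """Key fed to the (unknown) hardware hash under the conjecture above.

    IVL: {VID, EFID, MAC}; SVL: {FID, EFID, MAC}. This is a model of the
    hypothesis, not a description of confirmed hardware behaviour.
    """
    selector = vid if ivl else fid
    return (selector, efid, mac.lower())


MAC = "00:00:aa:aa:aa:aa"

# Under the conjecture, an IVL key ignores the FID of the VLAN ...
print(lut_key(True, 100, 0, 0, MAC) == lut_key(True, 100, 9, 0, MAC))
# ... and an SVL key ignores the VID of the frame.
print(lut_key(False, 100, 0, 0, MAC) == lut_key(False, 101, 0, 0, MAC))
```

If the model is right, two frames that differ only in a field the key ignores should land on the same look-up-table index.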
I tried the following scenario:
# add VLAN 100/101
bridge vlan add vid 100 dev swp2
bridge vlan add vid 101 dev swp2
# send VID 100 frame from another host on the network
mausezahn eth2 -Q 100 -c 1 -a '00:00:aa:aa:aa:aa' -t tcp
# dump HW FDB
cat /sys/kernel/debug/rtl8365mb/lut_dump
# send VID 101 frame this time
mausezahn eth2 -Q 101 -c 1 -a '00:00:aa:aa:aa:aa' -t tcp
# dump HW FDB
cat /sys/kernel/debug/rtl8365mb/lut_dump
I then tested this out with:
- IVL, FID=0
##### send frame on VLAN 100
cat /sys/kernel/debug/rtl8365mb/lut_dump
addr mac vid_fid spa fid efid
0388 00:00:aa:aa:aa:aa 100 2 0 0
##### send frame on VLAN 101
cat /sys/kernel/debug/rtl8365mb/lut_dump
addr mac vid_fid spa fid efid
0388 00:00:aa:aa:aa:aa 100 2 0 0
0420 00:00:aa:aa:aa:aa 101 2 0 0
- IVL, FID=9
##### send frame on VLAN 100
cat /sys/kernel/debug/rtl8365mb/lut_dump
addr mac vid_fid spa fid efid
0388 00:00:aa:aa:aa:aa 100 2 9 0
##### send frame on VLAN 101
cat /sys/kernel/debug/rtl8365mb/lut_dump
addr mac vid_fid spa fid efid
0388 00:00:aa:aa:aa:aa 100 2 9 0
0420 00:00:aa:aa:aa:aa 101 2 9 0
- SVL, FID=0
##### send frame on VLAN 100
cat /sys/kernel/debug/rtl8365mb/lut_dump
addr mac vid_fid spa fid efid
1280 00:00:aa:aa:aa:aa 100 2 0 0
##### send frame on VLAN 101
cat /sys/kernel/debug/rtl8365mb/lut_dump
addr mac vid_fid spa fid efid
1280 00:00:aa:aa:aa:aa 101 2 0 0
- SVL, FID=9
##### send frame on VLAN 100
cat /sys/kernel/debug/rtl8365mb/lut_dump
addr mac vid_fid spa fid efid
1056 00:00:aa:aa:aa:aa 100 2 9 0
##### send frame on VLAN 101
cat /sys/kernel/debug/rtl8365mb/lut_dump
addr mac vid_fid spa fid efid
1056 00:00:aa:aa:aa:aa 101 2 9 0
Some observations:
- with IVL, index is the same for FID=0,9
- with SVL, index is different for FID=0,9
- with IVL, index is different for VID=100,101
- with SVL, index is the same for VID=100,101
In particular, with IVL, the FID is stored in the table but it does not
seem to affect the index.
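The four observations can be checked mechanically against the indices from the dumps above. The following Python sketch only tabulates those measured values; the helper name varies_with is made up for illustration.

```python
# (mode, FID of the VLAN, VID of the frame) -> look-up-table index,
# copied from the four dumps above.
OBSERVED = {
    ("IVL", 0, 100): 388,
    ("IVL", 0, 101): 420,
    ("IVL", 9, 100): 388,
    ("IVL", 9, 101): 420,
    ("SVL", 0, 100): 1280,
    ("SVL", 0, 101): 1280,
    ("SVL", 9, 100): 1056,
    ("SVL", 9, 101): 1056,
}


def varies_with(mode: str, field: str) -> bool:
    """True if, in `mode`, changing only `field` ("fid" or "vid") moved the index."""
    rows = [(fid, vid, idx) for (m, fid, vid), idx in OBSERVED.items() if m == mode]
    for fid_a, vid_a, idx_a in rows:
        for fid_b, vid_b, idx_b in rows:
            if field == "fid":
                only_field_differs = fid_a != fid_b and vid_a == vid_b
            else:
                only_field_differs = vid_a != vid_b and fid_a == fid_b
            if only_field_differs and idx_a != idx_b:
                return True
    return False


for mode in ("IVL", "SVL"):
    for field in ("fid", "vid"):
        print(mode, field, varies_with(mode, field))
```

With only two FIDs and two VIDs sampled, this supports the conjecture but does not prove it.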
> Can you make an IVL VID reuse the
> same internal FID as an SVL VID? Can you make two IVL VIDs use the same
> internal FID?
>
> Anyway, this complicates things by quite a bit. The Linux bridge doesn't
> really have an SVL/IVL knob. It assumes IVL. Where things will get
> challenging is when you offload FDB entries with a given {VID, MAC DA},
> what to do if you access the FDB by FID, but in fact there isn't a
> bijective mapping between the VID and the FID?
I _think_ I can look up the FDB by VID, not just FID - I still have to
confirm that but I think it depends on whether the particular VLAN is in
IVL or SVL mode.
But either way, there are bound to be collisions given the way the
look-up-table works. If the driver is asked to offload two FDB entries
which map to the same look-up-table entry (i.e. same index), can't it
just error out on the second request? Something like "I see this entry
is already occupied by a static (offloaded) FDB entry, so I can't
satisfy this request".
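A minimal sketch of that "error out on the second request" idea, in Python for brevity. Everything here is hypothetical: toy_index is a stand-in for the undocumented hardware hash (it is NOT the real algorithm), each index is treated as a single slot for simplicity, and the class and exception names are invented.

```python
LUT_SIZE = 2048  # "2K-entry look-up table" per the datasheet excerpt


def toy_index(key):
    """Stand-in for the hardware hash; chosen only so collisions are easy to show."""
    vid, mac = key
    return (vid + sum(int(b, 16) for b in mac.split(":"))) % LUT_SIZE


class LutSlotBusy(Exception):
    """Raised when a static entry already occupies the target index."""


class StaticFdb:
    """Driver-side record of which indices hold static (offloaded) entries."""

    def __init__(self, index_of=toy_index):
        self._index_of = index_of
        self._slots = {}  # index -> key of the static entry stored there

    def add(self, key):
        index = self._index_of(key)
        owner = self._slots.get(index)
        if owner is not None and owner != key:
            raise LutSlotBusy(f"index {index} already holds static entry {owner}")
        self._slots[index] = key
        return index

    def delete(self, key):
        index = self._index_of(key)
        if self._slots.get(index) == key:
            del self._slots[index]


fdb = StaticFdb()
fdb.add((100, "00:00:aa:aa:aa:aa"))
try:
    # Chosen to collide with the first key under toy_index.
    fdb.add((101, "00:00:aa:aa:aa:a9"))
except LutSlotBusy as err:
    print("rejected:", err)
```

In a real driver the rejection would presumably surface as an error code from the FDB add operation rather than an exception.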
> You keep reference counts
> per FDB entry, such that when the user deletes a MAC DA from VID A, but
> you also have that MAC DA in VID B, both of which map to the same FID,
> you still keep the entry?
> And most importantly, do you see the FID bits
> in the tagger in the receive path as well?
No, I don't, which is kind of strange. But is it a problem?
> Can you dump them for packets
> classified to a FID in different ways, using IVL, SVL?
>
>> It could be that my conclusions about "lookup by VID" as opposed to
>> "lookup by FID" are wrong, but if it comes to that, I will just have to
>> manually implement VID<->FID mapping in the driver.
>
> And this is the second complication. Whatever VID<->FID mapping you make,
> if it's not static, you'll need a lookup table in the tagging protocol
> driver to translate the VID from the skb to a FID. Odd. Or maybe I'm wrong.
OK, I think these last questions of yours are based on the premise of
some kind of VID<->FID mapping. But I hope this email demonstrates to
you that the switch behaves somewhat differently.
>
>>> Practically are you saying that the switch loses the EFID information
>>> between the ingress and the egress stage, since the destination port
>>> mask is selected based on a key constructed with "don't care" in the EFID bits?
>>> Strange.
>>
>> Strange indeed - and wrong! I just checked this again. The switch
>> actually _does_ preserve the EFID for the second lookup when selecting
>> the destination port mask, and this behaves as you would expect. My
>> observation to the contrary was specifically for the case where there is
>> no hit for the destination address, in which case the switch will
>> _flood_ according to the VLAN and MAC DA, without regard for the EFID.
>> This kind of makes sense, since the EFID is just a searching/learning
>> look-up-table concept and is not related to flooding. OTOH there are
>> flooding port mask registers where one can set for
>> {uni,multi,broad}cast, but this configuration is independent of VLAN.
>
> So flooding is indeed the miss action from the FDB, but I'm just
> wondering, aren't the flood control registers replicated per FID in fact?
No, they seem to be global. Here's what the register definitions look
like in the vendor driver:
#define RTL8367C_REG_UNDA_FLOODING_PMSK 0x0890
#define RTL8367C_UNDA_FLOODING_PMSK_OFFSET 0
#define RTL8367C_UNDA_FLOODING_PMSK_MASK 0x7FF
#define RTL8367C_REG_UNMCAST_FLOADING_PMSK 0x0891
#define RTL8367C_UNMCAST_FLOADING_PMSK_OFFSET 0
#define RTL8367C_UNMCAST_FLOADING_PMSK_MASK 0x7FF
#define RTL8367C_REG_BCAST_FLOADING_PMSK 0x0892
#define RTL8367C_BCAST_FLOADING_PMSK_OFFSET 0
#define RTL8367C_BCAST_FLOADING_PMSK_MASK 0x7FF
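For reference, the 0x7FF mask is 11 bits wide, i.e. one bit per port and no per-FID or per-VLAN dimension. A small Python sketch of how such a field might be read and modified, assuming the mask is already in register position and that bit n corresponds to port n (both are assumptions, not something the defines state); field_get, field_set and ports_in_mask are illustrative names only:

```python
RTL8367C_REG_UNDA_FLOODING_PMSK = 0x0890
RTL8367C_UNDA_FLOODING_PMSK_OFFSET = 0
RTL8367C_UNDA_FLOODING_PMSK_MASK = 0x7FF


def field_get(reg_value: int, mask: int, offset: int) -> int:
    """Extract a field from a 16-bit register value (mask assumed in place)."""
    return (reg_value & mask) >> offset


def field_set(reg_value: int, mask: int, offset: int, field: int) -> int:
    """Return reg_value with the field replaced, other bits untouched."""
    return (reg_value & ~mask) | ((field << offset) & mask)


def ports_in_mask(pmsk: int) -> list:
    """List port numbers set in an 11-bit port mask (bit n = port n, assumed)."""
    width = RTL8367C_UNDA_FLOODING_PMSK_MASK.bit_length()
    return [port for port in range(width) if pmsk & (1 << port)]


# Example: a register value whose port mask selects ports 0 and 2.
value = 0x0005
pmsk = field_get(
    value,
    RTL8367C_UNDA_FLOODING_PMSK_MASK,
    RTL8367C_UNDA_FLOODING_PMSK_OFFSET,
)
print(ports_in_mask(pmsk))
```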
Thanks for your help.
Alvin