linux-kernel - Re: [PATCH net v1] net: dsa: lantiq_gswip: Fix FDB add/remove on the CPU port

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20220702185652.dpzrxuitacqp6m3t@skbuf>
Date:   Sat, 2 Jul 2022 21:56:52 +0300
From:   Vladimir Oltean <olteanv@...il.com>
To:     Martin Blumenstingl <martin.blumenstingl@...glemail.com>
Cc:     Hauke Mehrtens <hauke@...ke-m.de>, netdev@...r.kernel.org,
        andrew@...n.ch, vivien.didelot@...il.com, davem@...emloft.net,
        edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
        linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH net v1] net: dsa: lantiq_gswip: Fix FDB add/remove on the
 CPU port

On Sat, Jul 02, 2022 at 07:43:11PM +0200, Martin Blumenstingl wrote:
> Hi Vladimir,
> 
> On Fri, Jul 1, 2022 at 3:02 PM Vladimir Oltean <olteanv@...il.com> wrote:
> [...]
> > > Use FID 0 (which is also the "default" FID) when adding/removing an FDB
> > > entry for the CPU port.
> >
> > What does "default" FID even mean, and why is the default FID relevant?
> The GSW140 datasheet [0] (which is for a newer IP than the one we are
> targeting currently with the GSWIP driver - but I am not aware of any
> older datasheets)

Thanks for the document! Really useful.

> page 78 mentions: "By default the FID is zero and all entries belong
> to shared VLAN learning."

Not talking about the hardware defaults when it's obvious the driver
changes those, in an attempt to comply to Linux networking expectations...

> > In any case, I recommend you to first set up a test bench where you
> > actually see a difference between packets being flooded to the CPU vs
> > matching an FDB entry targeting it. Then read up a bit what the provided
> > dsa_db argument wants from port_fdb_add(). This conversation with Alvin
> > should explain a few things.
> > https://patchwork.kernel.org/project/netdevbpf/cover/20220302191417.1288145-1-vladimir.oltean@nxp.com/#24763870
> I previously asked Hauke whether the RX tag (net/dsa/tag_gswip.c) has
> some bit to indicate whether traffic is flooded - but to his knowledge
> the switch doesn't provide this information.

Yeah, you generally won't find quite that level of detail even in more
advanced switches. Not that you need it...

> So I am not sure what I can do in this case - do you have any pointers for me?

Yes, I do.

gswip_setup has:

	/* Default unknown Broadcast/Multicast/Unicast port maps */
	gswip_switch_w(priv, BIT(cpu_port), GSWIP_PCE_PMAP1);
	gswip_switch_w(priv, BIT(cpu_port), GSWIP_PCE_PMAP2);
	gswip_switch_w(priv, BIT(cpu_port), GSWIP_PCE_PMAP3); <- replace BIT(cpu_port) with 0

If you can no longer ping, it means that flooding was how packets
reached the system.

It appears that what goes on is interesting.
The switch is configured to flood traffic that's unknown to the FDB only
to the CPU (notably not to other bridged ports).
In software, the packet reaches tag_gswip.c, where unlike the majority
of other DSA tagging protocols, we do not call dsa_default_offload_fwd_mark(skb).
Then, the packet reaches the software bridge, and the switch has
informed the bridge (via skb->offload_fwd_mark == 0) that the packet
hasn't been already flooded in hardware, so the software bridge needs to
do it (only if necessary, of course).

The software bridge floods the packet according to its own FDB. In your
case, the software bridge recognizes the MAC DA of the packet as being
equal to the MAC address of br0 itself, and so, it doesn't flood it,
just terminates it locally. This is true whether or not the switch
learned that address in its FDB on the CPU port.

> Also apologies if all of this is very obvious. So far I have only been
> working on the xMII part of Ethernet drivers, meaning: I am totally
> new to the FDB part.
> 
> > Then have a patch (set) lifting the "return -EINVAL" from gswip *properly*.
> > And only then do we get to ask the questions "how bad are things for
> > linux-5.18.y? how bad are they for linux-5.15.y? what do we need to do?".
> agreed
> 
> 
> Thanks again for your time and all these valuable hints Vladimir!
> Martin
> 
> 
> [0] https://assets.maxlinear.com/web/documents/617930_gsw140_ds_rev1.11.pdf

So if I'm right, the state of facts is quite "not broken" (quite the
other way around, I'm really impressed), although there are still
improvements to be made. Flooding could be offloaded to hardware, then
flooding to CPU could be turned off and controlled via port promiscuity.
This would save quite a few CPU cycles.