netdev - Re: commit 4c7ea3c0791e (net: dsa: mv88e6xxx: disable SA learning for DSA and CPU ports)

Message-ID: <87k0s9kfms.fsf@waldekranz.com>
Date:   Tue, 19 Jan 2021 00:08:11 +0100
From:   Tobias Waldekranz <tobias@...dekranz.com>
To:     Rasmus Villemoes <rasmus.villemoes@...vas.dk>,
        Vladimir Oltean <olteanv@...il.com>
Cc:     Andrew Lunn <andrew@...n.ch>,
        Network Development <netdev@...r.kernel.org>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Horatiu Vultur <horatiu.vultur@...rochip.com>
Subject: Re: commit 4c7ea3c0791e (net: dsa: mv88e6xxx: disable SA learning for DSA and CPU ports)

On Mon, Jan 18, 2021 at 23:07, Rasmus Villemoes <rasmus.villemoes@...vas.dk> wrote:
> On 18/01/2021 22.19, Vladimir Oltean wrote:
>> On Sat, Jan 16, 2021 at 02:42:12AM +0100, Tobias Waldekranz wrote:
>>>> What I'm _really_ trying to do is to get my mv88e6250 to participate in
>>>> an MRP ring, which AFAICT will require that the master device's MAC gets
>>>> added as a static entry in the ATU: Otherwise, when the ring goes from
>>>> open to closed, I've seen the switch wrongly learn the node's own mac
>>>> address as being in the direction of one of the normal ports, which
>>>> obviously breaks all traffic. So if the topology is
>>>>
>>>>    M
>>>>  /   \
>>>> C1 *** C2
>>>>
>>>> with the link between C1 and C2 being broken, both M-C1 and M-C2 links
>>>> are in forwarding (hence learning) state, so when the C1-C2 link gets
>>>> reestablished, it will take at least one received test packet for M to
>>>> decide to put one of the ports in blocking state - by which time the
>>>> damage is done, and the ATU now has a broken entry for M's own mac address.
>> 
>> What hardware offload features do you need to use for MRP on mv88e6xxx?
>> If none, then considering that Tobias's bridge series may stall, I think
>> by far the easiest approach would be for DSA to detect that it can't
>> offload the bridge+MRP configuration, and keep all ports as standalone.
>> When in standalone mode, the ports don't offload any bridge flags, i.e.
>> they don't do address learning, and the only forwarding destination
>> allowed is the CPU. The only disadvantage is that this is software-based
>> forwarding.

Just to put some context around how these protocols are typically deployed:
The ring is the backbone of the whole network and can span hundreds of
switches. Applications range from low-bandwidth process automation to
high-definition IPTV. But even when you can meet the throughput demands
with a CPU, your latency will be off the charts, far beyond what is
acceptable in most cases.

> Which would be an unacceptable regression for my customer's use case. We
> really need some ring redundancy protocol, while also having the switch
> act as, well, a switch and do most forwarding in hardware. We used to
> use ERPS with some gross out-of-tree patches to set up the switch as
> required (much of the same stuff we're discussing here).
>
> Then when MRP got added to the kernel, and apparently some switches with
> hardware support for that are in the pipeline somewhere, we decided to
> try to switch to that - newer revisions of the hardware might include an
> MRP-capable switch, but the existing hardware with the marvell switches
> would (with a kernel and userspace upgrade) be able to coexist with that
> newer hardware.
>
> I took it for granted that MRP had been tested with existing
> switches/switchdev/DSA, but AFAICT (Horatiu, correct me if I'm wrong),
> currently MRP only works with a software bridge and with some
> out-of-tree driver for some not-yet-released hardware? I think I've
> identified what is needed to make it work with mv88e6xxx (and likely
> also other switchdev switches):
>
> (1) the port state as set on the software bridge must be
> offloaded/synchronized to the switch.
>
> (2) the bridge's hardware address must be made a static entry in the
> switch's database to avoid the switch accidentally learning a wrong port
> for that when the ring becomes closed.

I do not know MRP well enough, but that sounds reasonable if the same SA
is used for control packets sent through both ports.
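
To make the failure mode behind (2) concrete, here is a minimal toy model of
source-address learning. It is only a sketch: the ToyAtu class, the port
names and the MAC address are made up for illustration and do not reflect
the mv88e6xxx ATU layout or any kernel API. It assumes that a static entry
is never moved by learning, while a dynamic entry follows the last ingress
port.

```python
from dataclasses import dataclass, field

CPU_PORT = "cpu"                 # hypothetical port name
OWN_MAC = "02:00:00:00:00:01"    # made-up, locally administered address


@dataclass
class ToyAtu:
    """Toy address table mapping MAC -> (port, is_static)."""
    entries: dict = field(default_factory=dict)

    def add_static(self, mac, port):
        # Pin the address to a port; learning must not move it.
        self.entries[mac] = (port, True)

    def learn(self, mac, ingress_port):
        # Source-address learning: dynamic entries move, static ones stay.
        current = self.entries.get(mac)
        if current is not None and current[1]:
            return
        self.entries[mac] = (ingress_port, False)

    def lookup(self, mac):
        entry = self.entries.get(mac)
        return entry[0] if entry else None


def close_ring(atu, own_mac):
    """The node's own frame, sent on one ring port, loops back in on "p2"."""
    atu.learn(own_mac, "p2")
    return atu.lookup(own_mac)


dynamic_only = ToyAtu()
static_pinned = ToyAtu()
static_pinned.add_static(OWN_MAC, CPU_PORT)

print(close_ring(dynamic_only, OWN_MAC))   # p2: own MAC now points at a ring port
print(close_ring(static_pinned, OWN_MAC))  # cpu: the static entry is untouched
```

In the first case, frames addressed to the node itself would be sent out of
a ring port instead of to the CPU, which is the broken entry described in
the quoted scenario.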

> (3) the cpu must be made the only recipient of frames with an MRP
> multicast DA, 01:15:4e:...

I would possibly add (4): MRP comes in different "profiles". Some of
them require sending test packets at ridiculously high frequencies (more
than 1kHz IIRC). I would guess that Microchip has a programmable packet
generator that they can use for such things. We could potentially solve
that on newer 6xxx chips with the built-in IMP, but that is perhaps a
bit pie in the sky :)

> For (1), I think the only thing we need is to agree on where in the
> stack we translate from MRP to STP, because the like-named states in the
> two protocols really do behave exactly the same, AFAICT. So it can be
> done all the way up in MRP, perhaps even by getting completely rid of
> the distinction, or anywhere down the notifier stack, towards the actual
> switch driver.

You should search the archives. I distinctly remember somebody bringing
up this point before MRP was merged. So there ought to be some reason
for the existence of SWITCHDEV_ATTR_ID_MRP_PORT_STATE.
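
For reference, the like-named translation proposed for (1) could be sketched
as below. This is an assumption-laden illustration, not kernel code: the two
enums are Python stand-ins whose member names only mirror the MRP and STP
port states, and mrp_to_stp() is a hypothetical helper. The MRP "not
connected" state has no like-named STP state, so the sketch leaves it
unmapped rather than guessing.

```python
from enum import Enum, auto
from typing import Optional


class MrpPortState(Enum):
    """Stand-in for the MRP port states (illustrative only)."""
    DISABLED = auto()
    BLOCKED = auto()
    FORWARDING = auto()
    NOT_CONNECTED = auto()


class StpPortState(Enum):
    """Stand-in for the STP port states (illustrative only)."""
    DISABLED = auto()
    LISTENING = auto()
    LEARNING = auto()
    FORWARDING = auto()
    BLOCKING = auto()


# Only the like-named states are mapped; everything else is left open.
_LIKE_NAMED = {
    MrpPortState.DISABLED: StpPortState.DISABLED,
    MrpPortState.BLOCKED: StpPortState.BLOCKING,
    MrpPortState.FORWARDING: StpPortState.FORWARDING,
}


def mrp_to_stp(state: MrpPortState) -> Optional[StpPortState]:
    """Return the like-named STP state, or None if there is no obvious one."""
    return _LIKE_NAMED.get(state)


for s in MrpPortState:
    print(s.name, "->", mrp_to_stp(s))
```

Whether such a table belongs in MRP itself, in the notifier path, or in the
driver is exactly the open question, and the unmapped state may be part of
the reason a separate attribute exists.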

> For (2), I still have to see how far Tobias' patches will get me, but at
> least there's some reason independent of MRP to do that.

Like Vladimir said, do not count on that implementation making
it. Though something similar hopefully will.