[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 10 Nov 2020 20:28:57 -0800
From: Florian Fainelli <f.fainelli@...il.com>
To: Tobias Waldekranz <tobias@...dekranz.com>, andrew@...n.ch,
vivien.didelot@...il.com, olteanv@...il.com
Cc: netdev@...r.kernel.org
Subject: Re: [RFC PATCH 0/4] net: dsa: link aggregation support
On 10/27/2020 3:51 AM, Tobias Waldekranz wrote:
> This series starts by adding the generic support required to offload
> link aggregates to drivers built on top of the DSA subsystem. It then
> implements offloading for the mv88e6xxx driver, i.e. Marvell's
> LinkStreet family.
>
> Posting this as an RFC as there are some topics that I would like
> feedback on before going further with testing. Thus far I've done some
> basic tests to verify that:
>
> - A LAG can be used as a regular interface.
> - Bridging a LAG with other DSA ports works as expected.
> - Load balancing is done by the hardware, both in single- and
> multi-chip systems.
> - Load balancing is dynamically reconfigured when the state of
> individual links change.
>
> Testing as been done on two systems:
>
> 1. Single-chip system with one Peridot (6390X).
> 2. Multi-chip system with one Agate (6352) daisy-chained with an Opal
> (6097F).
>
> I would really appreciate feedback on the following:
>
> All LAG configuration is cached in `struct dsa_lag`s. I realize that
> the standard M.O. of DSA is to read back information from hardware
> when required. With LAGs this becomes very tricky though. For example,
> the change of a link state on one switch will require re-balancing of
> LAG hash buckets on another one, which in turn depends on the total
> number of active links in the LAG. Do you agree that this is
> motivated?
Yes, this makes sense, I did something quite similar in this branch
nearly 3 years ago, it was tested to the point where the switch was
programmed correctly but I had not configured the CPU port to support
2Gbits/sec (doh) to verify that we got 2x the desired throughput:
https://github.com/ffainelli/linux/commits/b53-bond
Your patch looks definitively more complete.
>
> The LAG driver ops all receive the LAG netdev as an argument when this
> information is already available through the port's lag pointer. This
> was done to match the way that the bridge netdev is passed to all VLAN
> ops even though it is in the port's bridge_dev. Is there a reason for
> this or should I just remove it from the LAG ops?
>
> At least on mv88e6xxx, the exact source port is not available when
> packets are received on the CPU. The way I see it, there are two ways
> around that problem:
>
> - Inject the packet directly on the LAG device (what this series
> does). Feels right because it matches all that we actually know; the
> packet came in on the LAG. It does complicate dsa_switch_rcv
> somewhat as we can no longer assume that skb->dev is a DSA port.
>
> - Inject the packet on "the designated port", i.e. some port in the
> LAG. This lets us keep the current Rx path untouched. The problem is
> that (a) the port would have to be dynamically updated to match the
> expectations of the LAG driver (team/bond) as links are
> enabled/disabled and (b) we would be presenting a lie because
> packets would appear to ingress on netdevs that they might not in
> fact have been physically received on.
>
> (mv88e6xxx) What is the policy regarding the use of DSA vs. EDSA? It
> seems like all chips capable of doing EDSA are using that, except for
> the Peridot.
>
> (mv88e6xxx) The cross-chip PVT changes required to allow a LAG to
> communicate with the other ports do not feel quite right, but I'm
> unsure about what the proper way of doing it would be. Any ideas?
>
> (mv88e6xxx) Marvell has historically used the idiosyncratic term
> "trunk" to refer to link aggregates. Somewhere around the Peridot they
> have switched and are now referring to the same registers/tables using
> the term "LAG". In this series I've stuck to using LAG for all generic
> stuff, and only used trunk for driver-internal functions. Do we want
> to rename everything to use the LAG nomenclature?
Yes please!
--
Florian
Powered by blists - more mailing lists