lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 26 Jun 2022 17:11:39 +0300
From:   Vladimir Oltean <olteanv@...il.com>
To:     Horatiu Vultur <horatiu.vultur@...rochip.com>
Cc:     linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        UNGLinuxDriver@...rochip.com, davem@...emloft.net,
        edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
        linux@...linux.org.uk
Subject: Re: [PATCH net-next 5/8] net: lan966x: Add lag support for lan966x.

Hi Horatiu,

Just casually browsing through the patches. A comment below.

On Sun, Jun 26, 2022 at 03:04:48PM +0200, Horatiu Vultur wrote:
> Add link aggregation hardware offload support for lan966x
> 
> Signed-off-by: Horatiu Vultur <horatiu.vultur@...rochip.com>
> ---
>  .../net/ethernet/microchip/lan966x/Makefile   |   2 +-
>  .../ethernet/microchip/lan966x/lan966x_lag.c  | 296 ++++++++++++++++++
>  .../ethernet/microchip/lan966x/lan966x_main.h |  28 ++
>  .../microchip/lan966x/lan966x_switchdev.c     |  78 ++++-
>  4 files changed, 388 insertions(+), 16 deletions(-)
>  create mode 100644 drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> 
> diff --git a/drivers/net/ethernet/microchip/lan966x/Makefile b/drivers/net/ethernet/microchip/lan966x/Makefile
> index fd2e0ebb2427..0c22c86bdaa9 100644
> --- a/drivers/net/ethernet/microchip/lan966x/Makefile
> +++ b/drivers/net/ethernet/microchip/lan966x/Makefile
> @@ -8,4 +8,4 @@ obj-$(CONFIG_LAN966X_SWITCH) += lan966x-switch.o
>  lan966x-switch-objs  := lan966x_main.o lan966x_phylink.o lan966x_port.o \
>  			lan966x_mac.o lan966x_ethtool.o lan966x_switchdev.o \
>  			lan966x_vlan.o lan966x_fdb.o lan966x_mdb.o \
> -			lan966x_ptp.o lan966x_fdma.o
> +			lan966x_ptp.o lan966x_fdma.o lan966x_lag.o
> diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c b/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> new file mode 100644
> index 000000000000..c721a05d44d2
> --- /dev/null
> +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> @@ -0,0 +1,296 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +
> +#include "lan966x_main.h"
> +
> +static void lan966x_lag_set_aggr_pgids(struct lan966x *lan966x)
> +{
> +	u32 visited = GENMASK(lan966x->num_phys_ports - 1, 0);
> +	int p, lag, i;
> +
> +	/* Reset destination and aggregation PGIDS */
> +	for (p = 0; p < lan966x->num_phys_ports; ++p)
> +		lan_wr(ANA_PGID_PGID_SET(BIT(p)),
> +		       lan966x, ANA_PGID(p));
> +
> +	for (p = PGID_AGGR; p < PGID_SRC; ++p)
> +		lan_wr(ANA_PGID_PGID_SET(visited),
> +		       lan966x, ANA_PGID(p));
> +
> +	/* The visited ports bitmask holds the list of ports offloading any
> +	 * bonding interface. Initially we mark all these ports as unvisited,
> +	 * then every time we visit a port in this bitmask, we know that it is
> +	 * the lowest numbered port, i.e. the one whose logical ID == physical
> +	 * port ID == LAG ID. So we mark as visited all further ports in the
> +	 * bitmask that are offloading the same bonding interface. This way,
> +	 * we set up the aggregation PGIDs only once per bonding interface.
> +	 */
> +	for (p = 0; p < lan966x->num_phys_ports; ++p) {
> +		struct lan966x_port *port = lan966x->ports[p];
> +
> +		if (!port || !port->bond)
> +			continue;
> +
> +		visited &= ~BIT(p);
> +	}
> +
> +	/* Now, set PGIDs for each active LAG */
> +	for (lag = 0; lag < lan966x->num_phys_ports; ++lag) {
> +		struct lan966x_port *port = lan966x->ports[lag];
> +		int num_active_ports = 0;
> +		struct net_device *bond;
> +		unsigned long bond_mask;
> +		u8 aggr_idx[16];
> +
> +		if (!port || !port->bond || (visited & BIT(lag)))
> +			continue;
> +
> +		bond = port->bond;
> +		bond_mask = lan966x_lag_get_mask(lan966x, bond, true);
> +
> +		for_each_set_bit(p, &bond_mask, lan966x->num_phys_ports) {
> +			lan_wr(ANA_PGID_PGID_SET(bond_mask),
> +			       lan966x, ANA_PGID(p));
> +			aggr_idx[num_active_ports++] = p;
> +		}

This incorrect logic seems to have been copied from ocelot from before
commit a14e6b69f393 ("net: mscc: ocelot: fix incorrect balancing with
down LAG ports").

The issue is that you calculate bond_mask with only_active_ports=true.
This means the for_each_set_bit() will not iterate through the inactive
LAG ports, and won't set the bond_mask as the PGID destination for those
ports.

That isn't what is desired; as explained in that commit, inactive LAG
ports should be removed via the aggregation PGIDs and not via the
destination PGIDs. Otherwise, an FDB entry targeted towards the
LAG (effectively towards the "primary" LAG port, whose logical port ID
gives the LAG ID) will not egress even the "secondary" LAG port if the
primary's link is down.

> +
> +		for (i = PGID_AGGR; i < PGID_SRC; ++i) {
> +			u32 ac;
> +
> +			ac = lan_rd(lan966x, ANA_PGID(i));
> +			ac &= ~bond_mask;
> +			/* Don't do division by zero if there was no active
> +			 * port. Just make all aggregation codes zero.
> +			 */
> +			if (num_active_ports)
> +				ac |= BIT(aggr_idx[i % num_active_ports]);
> +			lan_wr(ANA_PGID_PGID_SET(ac),
> +			       lan966x, ANA_PGID(i));
> +		}
> +
> +		/* Mark all ports in the same LAG as visited to avoid applying
> +		 * the same config again.
> +		 */
> +		for (p = lag; p < lan966x->num_phys_ports; p++) {
> +			struct lan966x_port *port = lan966x->ports[p];
> +
> +			if (!port)
> +				continue;
> +
> +			if (port->bond == bond)
> +				visited |= BIT(p);
> +		}
> +	}
> +}

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ