Message-ID: <20210720075354.u57sju7bvn5o3ses@soft-dev3-1.localhost>
Date:   Tue, 20 Jul 2021 09:53:54 +0200
From:   Horatiu Vultur <horatiu.vultur@...rochip.com>
To:     Vladimir Oltean <vladimir.oltean@....com>
CC:     <netdev@...r.kernel.org>, Jakub Kicinski <kuba@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Andrew Lunn <andrew@...n.ch>,
        Florian Fainelli <f.fainelli@...il.com>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Jiri Pirko <jiri@...nulli.us>,
        Ido Schimmel <idosch@...sch.org>,
        Tobias Waldekranz <tobias@...dekranz.com>,
        Roopa Prabhu <roopa@...dia.com>,
        Nikolay Aleksandrov <nikolay@...dia.com>,
        Stephen Hemminger <stephen@...workplumber.org>,
        <bridge@...ts.linux-foundation.org>,
        Grygorii Strashko <grygorii.strashko@...com>,
        Marek Behun <kabel@...ckhole.sk>,
        DENG Qingfang <dqfext@...il.com>,
        Vadym Kochan <vkochan@...vell.com>,
        Taras Chornyi <tchornyi@...vell.com>,
        Ioana Ciornei <ioana.ciornei@....com>,
        Lars Povlsen <lars.povlsen@...rochip.com>,
        Steen Hegelund <Steen.Hegelund@...rochip.com>,
        <UNGLinuxDriver@...rochip.com>,
        Claudiu Manoil <claudiu.manoil@....com>,
        Alexandre Belloni <alexandre.belloni@...tlin.com>
Subject: Re: [PATCH v4 net-next 09/15] net: bridge: switchdev: let drivers
 inform which bridge ports are offloaded

The 07/19/2021 00:44, Vladimir Oltean wrote:
> On reception of an skb, the bridge checks if it was marked as 'already
> forwarded in hardware' (checks if skb->offload_fwd_mark == 1), and if it
> is, it assigns the source hardware domain of that skb based on the
> hardware domain of the ingress port. Then during forwarding, it enforces
> that the egress port must have a different hardware domain than the
> ingress one (this is done in nbp_switchdev_allowed_egress).
> 
> Non-switchdev drivers don't report any physical switch id (neither
> through devlink nor .ndo_get_port_parent_id), therefore the bridge
> assigns them a hardware domain of 0, and packets coming from them will
> always have skb->offload_fwd_mark = 0. So there aren't any restrictions.
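
For illustration, the ingress marking and egress check described above can be
modeled in userspace C. This is only a sketch under stated assumptions:
struct model_port, struct model_skb and both helpers are made-up names, not
the bridge's real structures or functions, and hardware domain 0 is taken to
mean "no domain".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the bridge's per-port and
 * per-skb state. These are not the kernel structures.
 */
struct model_port {
	const char *name;
	int hwdom;		/* 0 means "no hardware domain" */
};

struct model_skb {
	bool offload_fwd_mark;	/* set by the driver on reception */
	int src_hwdom;		/* copied from the ingress port if marked */
};

/* Ingress: remember the hardware domain only for frames that the driver
 * marked as already forwarded in hardware.
 */
static void model_frame_mark(const struct model_port *in,
			     struct model_skb *skb)
{
	assert(in && skb);
	skb->src_hwdom = skb->offload_fwd_mark ? in->hwdom : 0;
}

/* Egress: software forwarding is allowed unless the frame was marked and
 * the egress port shares the ingress port's hardware domain.
 */
static bool model_allowed_egress(const struct model_port *out,
				 const struct model_skb *skb)
{
	assert(out && skb);
	return !skb->offload_fwd_mark || skb->src_hwdom != out->hwdom;
}
```

In this model a marked frame received on a port in domain 1 is refused on any
other domain-1 port but allowed on a domain-0 port, and an unmarked frame is
allowed everywhere.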
> 
> Problems appear due to the fact that DSA would like to perform software
> fallback for bonding and team interfaces that the physical switch cannot
> offload.
> 
>        +-- br0 ---+
>       / /   |      \
>      / /    |       \
>     /  |    |      bond0
>    /   |    |     /    \
>  swp0 swp1 swp2 swp3 swp4
> 
> There, it is desirable that the presence of swp3 and swp4 under a
> non-offloaded LAG does not preclude us from doing hardware bridging
> between swp0, swp1 and swp2. The bandwidth of the CPU is often high
> enough that software bridging between {swp0,swp1,swp2} and bond0 is not
> impractical.
> 
> But this creates an impossible paradox given the current way in which
> port hardware domains are assigned. When the driver receives a packet
> from swp0 (say, due to flooding), it must set skb->offload_fwd_mark to
> something.
> 
> - If we set it to 0, then the bridge will forward it towards swp1, swp2
>   and bond0. But the switch has already forwarded it towards swp1 and
>   swp2 (not to bond0, remember, that isn't offloaded, so as far as the
>   switch is concerned, ports swp3 and swp4 are not looking up the FDB,
>   and the entire bond0 is a destination that is strictly behind the
>   CPU). But we don't want duplicated traffic towards swp1 and swp2, so
>   it's not ok to set skb->offload_fwd_mark = 0.
> 
> - If we set it to 1, then the bridge will not forward the skb towards
>   the ports with the same hardware domain, i.e. not to swp1, swp2 and
>   bond0. Towards swp1 and swp2 that's ok, but towards bond0? It should
>   have forwarded the skb there.
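
The two cases above can be replayed with a small userspace model. It is a
sketch, not bridge code: the port enum and model_sw_flood_mask() are
hypothetical names, the frame is assumed to be flooded from swp0, and the
hardware domain numbers (1 for the switch, 0 for "none") are picked only for
the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bridge ports from the diagram; bond0 sits on top of swp3/swp4. */
enum { SWP0, SWP1, SWP2, BOND0, NUM_PORTS };

/* Return a bitmask of the ports to which the software bridge would flood
 * a frame received on swp0, given each port's hardware domain and the
 * mark chosen by the driver.
 */
static unsigned int model_sw_flood_mask(const int hwdom[NUM_PORTS],
					bool offload_fwd_mark)
{
	int src_hwdom;
	unsigned int mask = 0;
	int i;

	assert(hwdom != NULL);
	src_hwdom = offload_fwd_mark ? hwdom[SWP0] : 0;

	for (i = 0; i < NUM_PORTS; i++) {
		if (i == SWP0)
			continue;	/* never flood back to the ingress port */
		if (!offload_fwd_mark || src_hwdom != hwdom[i])
			mask |= 1u << i;
	}

	return mask;
}
```

With every port in domain 1 (bond0 inheriting it through recursion), a mark
of 0 floods to swp1, swp2 and bond0, duplicating what the switch already
sent, while a mark of 1 floods nowhere and starves bond0. With bond0 in
domain 0 and a mark of 1, only bond0 is flooded in software.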
> 
> So the real issue is that bond0 will be assigned the same hardware
> domain as {swp0,swp1,swp2}, because the function that assigns hardware
> domains to bridge ports, nbp_switchdev_add(), recurses through bond0's
> lower interfaces until it finds something that implements devlink (calls
> dev_get_port_parent_id with bool recurse = true). This is a problem
> because the fact that bond0 can be offloaded by swp3 and swp4 in our
> example is merely an assumption.
> 
> A solution is to give the bridge explicit hints as to what hardware
> domain it should use for each port.
> 
> Currently, the bridging offload is very 'silent': a driver registers a
> netdevice notifier, which is put on the netns's notifier chain, and
> which sniffs around for NETDEV_CHANGEUPPER events where the upper is a
> bridge, and the lower is an interface it knows about (one registered by
> this driver, normally). Then, from within that notifier, it does a bunch
> of stuff behind the bridge's back, without the bridge necessarily
> knowing that there's somebody offloading that port. It looks like this:
> 
>      ip link set swp0 master br0
>                   |
>                   v
>  br_add_if() calls netdev_master_upper_dev_link()
>                   |
>                   v
>         call_netdevice_notifiers
>                   |
>                   v
>        dsa_slave_netdevice_event
>                   |
>                   v
>         oh, hey! it's for me!
>                   |
>                   v
>            .port_bridge_join
> 
> What we do to solve the conundrum is to be less silent, and change the
> switchdev drivers to present themselves to the bridge. Something like this:
> 
>      ip link set swp0 master br0
>                   |
>                   v
>  br_add_if() calls netdev_master_upper_dev_link()
>                   |
>                   v                    bridge: Aye! I'll use this
>         call_netdevice_notifiers           ^  ppid as the
>                   |                        |  hardware domain for
>                   v                        |  this port, and zero
>        dsa_slave_netdevice_event           |  if I got nothing.
>                   |                        |
>                   v                        |
>         oh, hey! it's for me!              |
>                   |                        |
>                   v                        |
>            .port_bridge_join               |
>                   |                        |
>                   +------------------------+
>              switchdev_bridge_port_offload(swp0, swp0)
> 
> Then stacked interfaces (like bond0 on top of swp3/swp4) would be
> treated differently in DSA, depending on whether we can or cannot
> offload them.
> 
> The offload case:
> 
>     ip link set bond0 master br0
>                   |
>                   v
>  br_add_if() calls netdev_master_upper_dev_link()
>                   |
>                   v                    bridge: Aye! I'll use this
>         call_netdevice_notifiers           ^  ppid as the
>                   |                        |  hardware domain for
>                   v                        |        bond0.
>        dsa_slave_netdevice_event           | Coincidentally (or not),
>                   |                        | bond0 and swp0, swp1, swp2
>                   v                        | all have the same hardware
>         hmm, it's not quite for me,        | domain now, since the ASIC
>          but my driver has already         | is able to forward towards
>            called .port_lag_join           | all these ports in hw.
>           for it, because I have           |
>       a port with dp->lag_dev == bond0.    |
>                   |                        |
>                   v                        |
>            .port_bridge_join               |
>            for swp3 and swp4               |
>                   |                        |
>                   +------------------------+
>             switchdev_bridge_port_offload(bond0, swp3)
>             switchdev_bridge_port_offload(bond0, swp4)
> 
> And the non-offload case:
> 
>     ip link set bond0 master br0
>                   |
>                   v
>  br_add_if() calls netdev_master_upper_dev_link()
>                   |
>                   v                    bridge waiting:
>         call_netdevice_notifiers           ^  huh, switchdev_bridge_port_offload
>                   |                        |  wasn't called, okay, I'll use a
>                   v                        |  hwdom of zero for this one.
>        dsa_slave_netdevice_event           :  Then packets received on swp0 will
>                   |                        :  not be software-forwarded towards
>                   v                        :  swp1, but they will towards bond0.
>          it's not for me, but
>        bond0 is an upper of swp3
>       and swp4, but their dp->lag_dev
>        is NULL because they couldn't
>             offload it.
> 
> Basically we can draw the conclusion that the lowers of a bridge port
> can come and go, so depending on the configuration of lowers for a
> bridge port, it can dynamically toggle between offloaded and unoffloaded.
> Therefore, we need an equivalent switchdev_bridge_port_unoffload too.
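
One way to picture the toggling described above is a per-bridge-port count of
the lowers that currently offload it. The model below is a sketch built on
assumptions, not the code from this patch: struct model_brport and both
helpers are hypothetical, the switch id is reduced to a plain int, the
hardware domain simply reuses that id, and rejecting a second, different
switch with -EBUSY is an assumed policy.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical per-bridge-port offload state. */
struct model_brport {
	int hwdom;		/* 0 while nobody offloads the port */
	int ppid;		/* id of the offloading switch, 0 if none */
	int offload_count;	/* lowers currently offloading this port */
};

/* A lower port (e.g. swp3 under bond0) declares that it offloads p. */
static int model_port_offload(struct model_brport *p, int ppid)
{
	assert(p && ppid != 0);

	if (p->offload_count) {
		if (p->ppid != ppid)
			return -EBUSY;	/* assumed: one switch per bridge port */
		p->offload_count++;
		return 0;
	}

	p->ppid = ppid;
	p->offload_count = 1;
	p->hwdom = ppid;	/* simplification: domain number == switch id */
	return 0;
}

/* A lower port stops offloading p; the last one resets the domain to 0. */
static void model_port_unoffload(struct model_brport *p, int ppid)
{
	assert(p);

	if (!p->offload_count || p->ppid != ppid)
		return;
	if (--p->offload_count)
		return;

	p->ppid = 0;
	p->hwdom = 0;
}
```

In this model bond0 keeps its hardware domain while at least one of swp3 and
swp4 still offloads it, and falls back to domain 0 once the last one leaves.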
> 
> This patch changes the way any switchdev driver interacts with the
> bridge. From now on, everybody needs to call switchdev_bridge_port_offload
> and switchdev_bridge_port_unoffload, otherwise the bridge will treat the
> port as non-offloaded and allow software flooding to other ports from
> the same ASIC.
> 
> Note that these functions lay the ground for a more complex handshake
> between switchdev drivers and the bridge in the future. During the
> info->linking == false path, switchdev_bridge_port_unoffload() is
> strategically put in the NETDEV_PRECHANGEUPPER notifier as opposed to
> NETDEV_CHANGEUPPER. The reason for this has to do with a future
> migration of the switchdev object replay helpers (br_*_replay) from a
> pull mode (completely initiated by the driver) to a semi-push mode (the
> bridge initiates the replay when the switchdev driver declares that it
> offloads a port). On deletion, the switchdev object replay helpers need
> the netdev adjacency lists to be valid, and that is only true in
> NETDEV_PRECHANGEUPPER. So we need to add trivial glue code to all
> drivers to handle a "pre bridge leave" event, and that is where we hook
> the switchdev_bridge_port_unoffload() call.
> 
> Cc: Vadym Kochan <vkochan@...vell.com>
> Cc: Taras Chornyi <tchornyi@...vell.com>
> Cc: Ioana Ciornei <ioana.ciornei@....com>
> Cc: Lars Povlsen <lars.povlsen@...rochip.com>
> Cc: Steen Hegelund <Steen.Hegelund@...rochip.com>
> Cc: UNGLinuxDriver@...rochip.com
> Cc: Claudiu Manoil <claudiu.manoil@....com>
> Cc: Alexandre Belloni <alexandre.belloni@...tlin.com>
> Cc: Grygorii Strashko <grygorii.strashko@...com>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
> ---
> v2->v3: patch is new
> v3->v4: added mlxsw_sp_port_pre_lag_leave() and mlxsw_sp_port_pre_vlan_leave()
> 
>  .../ethernet/freescale/dpaa2/dpaa2-switch.c   | 14 ++-
>  .../ethernet/marvell/prestera/prestera_main.c | 30 +++++-
>  .../marvell/prestera/prestera_switchdev.c     | 29 +++++-
>  .../marvell/prestera/prestera_switchdev.h     |  7 +-
>  .../net/ethernet/mellanox/mlxsw/spectrum.c    | 50 ++++++++++
>  .../net/ethernet/mellanox/mlxsw/spectrum.h    |  4 +
>  .../mellanox/mlxsw/spectrum_switchdev.c       | 13 ++-
>  .../microchip/sparx5/sparx5_switchdev.c       | 41 +++++++-
>  drivers/net/ethernet/mscc/ocelot_net.c        | 81 ++++++++++++++++
>  drivers/net/ethernet/rocker/rocker.h          |  6 +-
>  drivers/net/ethernet/rocker/rocker_main.c     | 30 +++++-
>  drivers/net/ethernet/rocker/rocker_ofdpa.c    | 36 ++++++-
>  drivers/net/ethernet/ti/am65-cpsw-nuss.c      | 28 +++++-
>  drivers/net/ethernet/ti/cpsw_new.c            | 26 ++++-
>  include/linux/if_bridge.h                     | 26 +++++
>  net/bridge/br_if.c                            | 13 +--
>  net/bridge/br_private.h                       | 13 +--
>  net/bridge/br_switchdev.c                     | 94 +++++++++++++++++--
>  net/dsa/port.c                                | 20 +++-
>  19 files changed, 506 insertions(+), 55 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> index 23798feb40b2..2cd8a38e4f30 100644
> --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> @@ -1930,7 +1930,7 @@ static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
>         if (err)
>                 goto err_egress_flood;
> 
> -       return 0;
> +       return switchdev_bridge_port_offload(netdev, netdev, extack);
> 
>  err_egress_flood:
>         dpaa2_switch_port_set_fdb(port_priv, NULL);
> @@ -1957,6 +1957,13 @@ static int dpaa2_switch_port_restore_rxvlan(struct net_device *vdev, int vid, vo
>         return dpaa2_switch_port_vlan_add(arg, vlan_proto, vid);
>  }
> 
> +static int dpaa2_switch_port_pre_bridge_leave(struct net_device *netdev,
> +                                             struct net_device *upper_dev,
> +                                             struct netlink_ext_ack *extack)
> +{
> +       return switchdev_bridge_port_unoffload(netdev, netdev, extack);
> +}
> +
>  static int dpaa2_switch_port_bridge_leave(struct net_device *netdev)
>  {
>         struct ethsw_port_priv *port_priv = netdev_priv(netdev);
> @@ -2078,6 +2085,11 @@ static int dpaa2_switch_port_netdevice_event(struct notifier_block *nb,
>                 if (err)
>                         goto out;
> 
> +               if (!info->linking)
> +                       err = dpaa2_switch_port_pre_bridge_leave(netdev,
> +                                                                upper_dev,
> +                                                                extack);
> +
>                 break;
>         case NETDEV_CHANGEUPPER:
>                 upper_dev = info->upper_dev;
> diff --git a/drivers/net/ethernet/marvell/prestera/prestera_main.c b/drivers/net/ethernet/marvell/prestera/prestera_main.c
> index 1f3c8cd6ced2..62a0ccfdbe6b 100644
> --- a/drivers/net/ethernet/marvell/prestera/prestera_main.c
> +++ b/drivers/net/ethernet/marvell/prestera/prestera_main.c
> @@ -540,7 +540,8 @@ static void prestera_lag_destroy(struct prestera_switch *sw,
>  }
> 
>  static int prestera_lag_port_add(struct prestera_port *port,
> -                                struct net_device *lag_dev)
> +                                struct net_device *lag_dev,
> +                                struct netlink_ext_ack *extack)
>  {
>         struct prestera_switch *sw = port->sw;
>         struct prestera_lag *lag;
> @@ -570,6 +571,21 @@ static int prestera_lag_port_add(struct prestera_port *port,
>         return 0;
>  }
> 
> +static int prestera_pre_lag_leave(struct prestera_port *port,
> +                                 struct net_device *lag_dev,
> +                                 struct netlink_ext_ack *extack)
> +{
> +       if (netif_is_bridge_port(lag_dev)) {
> +               struct net_device *br_dev;
> +
> +               br_dev = netdev_master_upper_dev_get(lag_dev);
> +
> +               return prestera_pre_bridge_port_leave(br_dev, port, extack);
> +       }
> +
> +       return 0;
> +}
> +
>  static int prestera_lag_port_del(struct prestera_port *port)
>  {
>         struct prestera_switch *sw = port->sw;
> @@ -758,17 +774,25 @@ static int prestera_netdev_port_event(struct net_device *lower,
>                 if (err)
>                         return err;
> 
> +               if (netif_is_bridge_master(upper) && !info->linking)
> +                       return prestera_pre_bridge_port_leave(upper, port,
> +                                                             extack);
> +               else if (netif_is_lag_master(upper) && !info->linking)
> +                       return prestera_pre_lag_leave(port, upper, extack);
> +
>                 break;
> 
>         case NETDEV_CHANGEUPPER:
>                 if (netif_is_bridge_master(upper)) {
>                         if (info->linking)
> -                               return prestera_bridge_port_join(upper, port);
> +                               return prestera_bridge_port_join(upper, port,
> +                                                                extack);
>                         else
>                                 prestera_bridge_port_leave(upper, port);
>                 } else if (netif_is_lag_master(upper)) {
>                         if (info->linking)
> -                               return prestera_lag_port_add(port, upper);
> +                               return prestera_lag_port_add(port, upper,
> +                                                            extack);
>                         else
>                                 prestera_lag_port_del(port);
>                 }
> diff --git a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
> index a1fc4ab53ccf..4be82c043991 100644
> --- a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
> +++ b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
> @@ -480,7 +480,8 @@ prestera_bridge_1d_port_join(struct prestera_bridge_port *br_port)
>  }
> 
>  int prestera_bridge_port_join(struct net_device *br_dev,
> -                             struct prestera_port *port)
> +                             struct prestera_port *port,
> +                             struct netlink_ext_ack *extack)
>  {
>         struct prestera_switchdev *swdev = port->sw->swdev;
>         struct prestera_bridge_port *br_port;
> @@ -500,6 +501,10 @@ int prestera_bridge_port_join(struct net_device *br_dev,
>                 goto err_brport_create;
>         }
> 
> +       err = switchdev_bridge_port_offload(br_port->dev, port->dev, extack);
> +       if (err)
> +               goto err_brport_offload;
> +
>         if (bridge->vlan_enabled)
>                 return 0;
> 
> @@ -510,12 +515,34 @@ int prestera_bridge_port_join(struct net_device *br_dev,
>         return 0;
> 
>  err_port_join:
> +       switchdev_bridge_port_unoffload(br_port->dev, port->dev, extack);
> +err_brport_offload:
>         prestera_bridge_port_put(br_port);
>  err_brport_create:
>         prestera_bridge_put(bridge);
>         return err;
>  }
> 
> +int prestera_pre_bridge_port_leave(struct net_device *br_dev,
> +                                  struct prestera_port *port,
> +                                  struct netlink_ext_ack *extack)
> +{
> +       struct prestera_switchdev *swdev = port->sw->swdev;
> +       struct prestera_bridge_port *br_port;
> +       struct prestera_bridge *bridge;
> +
> +       bridge = prestera_bridge_by_dev(swdev, br_dev);
> +       if (!bridge)
> +               return -ENODEV;
> +
> +       br_port = __prestera_bridge_port_by_dev(bridge, port->dev);
> +       if (!br_port)
> +               return -ENODEV;
> +
> +       return switchdev_bridge_port_unoffload(br_port->dev, port->dev,
> +                                              extack);
> +}
> +
>  static void prestera_bridge_1q_port_leave(struct prestera_bridge_port *br_port)
>  {
>         struct prestera_port *port = netdev_priv(br_port->dev);
> diff --git a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.h b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.h
> index a91bc35d235f..05ec54e2af26 100644
> --- a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.h
> +++ b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.h
> @@ -8,7 +8,12 @@ int prestera_switchdev_init(struct prestera_switch *sw);
>  void prestera_switchdev_fini(struct prestera_switch *sw);
> 
>  int prestera_bridge_port_join(struct net_device *br_dev,
> -                             struct prestera_port *port);
> +                             struct prestera_port *port,
> +                             struct netlink_ext_ack *extack);
> +
> +int prestera_pre_bridge_port_leave(struct net_device *br_dev,
> +                                  struct prestera_port *port,
> +                                  struct netlink_ext_ack *extack);
> 
>  void prestera_bridge_port_leave(struct net_device *br_dev,
>                                 struct prestera_port *port);
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
> index b3d1fdc2d094..05ff4936cf5d 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
> @@ -3811,6 +3811,21 @@ static int mlxsw_sp_port_lag_join(struct mlxsw_sp_port *mlxsw_sp_port,
>         return err;
>  }
> 
> +static int mlxsw_sp_port_pre_lag_leave(struct mlxsw_sp_port *mlxsw_sp_port,
> +                                      struct net_device *lag_dev,
> +                                      struct netlink_ext_ack *extack)
> +{
> +       struct net_device *br_dev;
> +
> +       if (!netif_is_bridge_port(lag_dev))
> +               return 0;
> +
> +       br_dev = netdev_master_upper_dev_get(lag_dev);
> +
> +       return mlxsw_sp_port_pre_bridge_leave(mlxsw_sp_port, lag_dev,
> +                                             br_dev, extack);
> +}
> +
>  static void mlxsw_sp_port_lag_leave(struct mlxsw_sp_port *mlxsw_sp_port,
>                                     struct net_device *lag_dev)
>  {
> @@ -3994,6 +4009,21 @@ static void mlxsw_sp_port_ovs_leave(struct mlxsw_sp_port *mlxsw_sp_port)
>         mlxsw_sp_port_vp_mode_set(mlxsw_sp_port, false);
>  }
> 
> +static int mlxsw_sp_port_pre_vlan_leave(struct mlxsw_sp_port *mlxsw_sp_port,
> +                                       struct net_device *vlan_dev,
> +                                       struct netlink_ext_ack *extack)
> +{
> +       struct net_device *br_dev;
> +
> +       if (!netif_is_bridge_port(vlan_dev))
> +               return 0;
> +
> +       br_dev = netdev_master_upper_dev_get(vlan_dev);
> +
> +       return mlxsw_sp_port_pre_bridge_leave(mlxsw_sp_port, vlan_dev,
> +                                             br_dev, extack);
> +}
> +
>  static void mlxsw_sp_port_vlan_leave(struct mlxsw_sp_port *mlxsw_sp_port,
>                                      struct net_device *vlan_dev)
>  {
> @@ -4208,6 +4238,20 @@ static int mlxsw_sp_netdevice_port_upper_event(struct net_device *lower_dev,
>                 if (err)
>                         return err;
> 
> +               if (netif_is_bridge_master(upper_dev) && !info->linking)
> +                       err = mlxsw_sp_port_pre_bridge_leave(mlxsw_sp_port,
> +                                                            lower_dev,
> +                                                            upper_dev,
> +                                                            extack);
> +               if (netif_is_lag_master(upper_dev) && !info->linking)
> +                       err = mlxsw_sp_port_pre_lag_leave(mlxsw_sp_port,
> +                                                         upper_dev,
> +                                                         extack);
> +               if (is_vlan_dev(upper_dev) && !info->linking)
> +                       err = mlxsw_sp_port_pre_vlan_leave(mlxsw_sp_port,
> +                                                          upper_dev,
> +                                                          extack);
> +
>                 break;
>         case NETDEV_CHANGEUPPER:
>                 upper_dev = info->upper_dev;
> @@ -4372,6 +4416,12 @@ static int mlxsw_sp_netdevice_port_vlan_event(struct net_device *vlan_dev,
>                 if (err)
>                         return err;
> 
> +               if (netif_is_bridge_master(upper_dev) && !info->linking)
> +                       err = mlxsw_sp_port_pre_bridge_leave(mlxsw_sp_port,
> +                                                            vlan_dev,
> +                                                            upper_dev,
> +                                                            extack);
> +
>                 break;
>         case NETDEV_CHANGEUPPER:
>                 upper_dev = info->upper_dev;
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
> index f99db88ee884..54d22ecee4e2 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
> @@ -596,6 +596,10 @@ int mlxsw_sp_port_bridge_join(struct mlxsw_sp_port *mlxsw_sp_port,
>                               struct net_device *brport_dev,
>                               struct net_device *br_dev,
>                               struct netlink_ext_ack *extack);
> +int mlxsw_sp_port_pre_bridge_leave(struct mlxsw_sp_port *mlxsw_sp_port,
> +                                  struct net_device *brport_dev,
> +                                  struct net_device *br_dev,
> +                                  struct netlink_ext_ack *extack);
>  void mlxsw_sp_port_bridge_leave(struct mlxsw_sp_port *mlxsw_sp_port,
>                                 struct net_device *brport_dev,
>                                 struct net_device *br_dev);
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
> index 5a0c7c94874e..731234a2ace3 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
> @@ -2367,6 +2367,7 @@ int mlxsw_sp_port_bridge_join(struct mlxsw_sp_port *mlxsw_sp_port,
>  {
>         struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
>         struct mlxsw_sp_bridge_device *bridge_device;
> +       struct net_device *dev = mlxsw_sp_port->dev;
>         struct mlxsw_sp_bridge_port *bridge_port;
>         int err;
> 
> @@ -2381,13 +2382,23 @@ int mlxsw_sp_port_bridge_join(struct mlxsw_sp_port *mlxsw_sp_port,
>         if (err)
>                 goto err_port_join;
> 
> -       return 0;
> +       return switchdev_bridge_port_offload(brport_dev, dev, extack);
> 
>  err_port_join:
>         mlxsw_sp_bridge_port_put(mlxsw_sp->bridge, bridge_port);
>         return err;
>  }
> 
> +int mlxsw_sp_port_pre_bridge_leave(struct mlxsw_sp_port *mlxsw_sp_port,
> +                                  struct net_device *brport_dev,
> +                                  struct net_device *br_dev,
> +                                  struct netlink_ext_ack *extack)
> +{
> +       struct net_device *dev = mlxsw_sp_port->dev;
> +
> +       return switchdev_bridge_port_unoffload(brport_dev, dev, extack);
> +}
> +
>  void mlxsw_sp_port_bridge_leave(struct mlxsw_sp_port *mlxsw_sp_port,
>                                 struct net_device *brport_dev,
>                                 struct net_device *br_dev)
> diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
> index a72e3b3b596e..270b9fabce91 100644
> --- a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
> +++ b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
> @@ -93,9 +93,11 @@ static int sparx5_port_attr_set(struct net_device *dev, const void *ctx,
>  }
> 
>  static int sparx5_port_bridge_join(struct sparx5_port *port,
> -                                  struct net_device *bridge)
> +                                  struct net_device *bridge,
> +                                  struct netlink_ext_ack *extack)
>  {
>         struct sparx5 *sparx5 = port->sparx5;
> +       struct net_device *ndev = port->ndev;
> 
>         if (bitmap_empty(sparx5->bridge_mask, SPX5_PORTS))
>                 /* First bridged port */
> @@ -112,9 +114,17 @@ static int sparx5_port_bridge_join(struct sparx5_port *port,
>         /* Port enters in bridge mode therefor don't need to copy to CPU
>          * frames for multicast in case the bridge is not requesting them
>          */
> -       __dev_mc_unsync(port->ndev, sparx5_mc_unsync);
> +       __dev_mc_unsync(ndev, sparx5_mc_unsync);
> 
> -       return 0;
> +       return switchdev_bridge_port_offload(ndev, ndev, extack);
> +}
> +
> +static int sparx5_port_pre_bridge_leave(struct sparx5_port *port,
> +                                       struct netlink_ext_ack *extack)
> +{
> +       struct net_device *ndev = port->ndev;
> +
> +       return switchdev_bridge_port_unoffload(ndev, ndev, extack);
>  }
> 
>  static void sparx5_port_bridge_leave(struct sparx5_port *port,
> @@ -135,15 +145,35 @@ static void sparx5_port_bridge_leave(struct sparx5_port *port,
>         __dev_mc_sync(port->ndev, sparx5_mc_sync, sparx5_mc_unsync);
>  }
> 
> +static int
> +sparx5_port_prechangeupper(struct net_device *dev,
> +                          struct netdev_notifier_changeupper_info *info)
> +{
> +       struct sparx5_port *port = netdev_priv(dev);
> +       struct netlink_ext_ack *extack;
> +       int err = 0;
> +
> +       extack = netdev_notifier_info_to_extack(&info->info);
> +
> +       if (netif_is_bridge_master(info->upper_dev) && !info->linking)
> +               err = sparx5_port_pre_bridge_leave(port, extack);
> +
> +       return err;
> +}
> +
>  static int sparx5_port_changeupper(struct net_device *dev,
>                                    struct netdev_notifier_changeupper_info *info)
>  {
>         struct sparx5_port *port = netdev_priv(dev);
> +       struct netlink_ext_ack *extack;
>         int err = 0;
> 
> +       extack = netdev_notifier_info_to_extack(&info->info);
> +
>         if (netif_is_bridge_master(info->upper_dev)) {
>                 if (info->linking)
> -                       err = sparx5_port_bridge_join(port, info->upper_dev);
> +                       err = sparx5_port_bridge_join(port, info->upper_dev,
> +                                                     extack);
>                 else
>                         sparx5_port_bridge_leave(port, info->upper_dev);
> 
> @@ -177,6 +207,9 @@ static int sparx5_netdevice_port_event(struct net_device *dev,
>                 return 0;
> 
>         switch (event) {
> +       case NETDEV_PRECHANGEUPPER:
> +               err = sparx5_port_prechangeupper(dev, ptr);
> +               break;
>         case NETDEV_CHANGEUPPER:
>                 err = sparx5_port_changeupper(dev, ptr);
>                 break;
> diff --git a/drivers/net/ethernet/mscc/ocelot_net.c b/drivers/net/ethernet/mscc/ocelot_net.c
> index e9d260d84bf3..dcb393a35c0e 100644
> --- a/drivers/net/ethernet/mscc/ocelot_net.c
> +++ b/drivers/net/ethernet/mscc/ocelot_net.c
> @@ -1216,6 +1216,10 @@ static int ocelot_netdevice_bridge_join(struct net_device *dev,
> 
>         ocelot_port_bridge_join(ocelot, port, bridge);
> 
> +       err = switchdev_bridge_port_offload(brport_dev, dev, extack);
> +       if (err)
> +               goto err_switchdev_offload;
> +
>         err = ocelot_switchdev_sync(ocelot, port, brport_dev, bridge, extack);
>         if (err)
>                 goto err_switchdev_sync;
> @@ -1223,10 +1227,19 @@ static int ocelot_netdevice_bridge_join(struct net_device *dev,
>         return 0;
> 
>  err_switchdev_sync:
> +       switchdev_bridge_port_unoffload(brport_dev, dev, extack);
> +err_switchdev_offload:
>         ocelot_port_bridge_leave(ocelot, port, bridge);
>         return err;
>  }
> 
> +static int ocelot_netdevice_pre_bridge_leave(struct net_device *dev,
> +                                            struct net_device *brport_dev,
> +                                            struct netlink_ext_ack *extack)
> +{
> +       return switchdev_bridge_port_unoffload(brport_dev, dev, extack);
> +}
> +
>  static int ocelot_netdevice_bridge_leave(struct net_device *dev,
>                                          struct net_device *brport_dev,
>                                          struct net_device *bridge)
> @@ -1279,6 +1292,19 @@ static int ocelot_netdevice_lag_join(struct net_device *dev,
>         return err;
>  }
> 
> +static int ocelot_netdevice_pre_lag_leave(struct net_device *dev,
> +                                         struct net_device *bond,
> +                                         struct netlink_ext_ack *extack)
> +{
> +       struct net_device *bridge_dev;
> +
> +       bridge_dev = netdev_master_upper_dev_get(bond);
> +       if (!bridge_dev || !netif_is_bridge_master(bridge_dev))
> +               return 0;
> +
> +       return ocelot_netdevice_pre_bridge_leave(dev, bond, extack);
> +}
> +
>  static int ocelot_netdevice_lag_leave(struct net_device *dev,
>                                       struct net_device *bond)
>  {
> @@ -1355,6 +1381,50 @@ ocelot_netdevice_lag_changeupper(struct net_device *dev,
>         return NOTIFY_DONE;
>  }
> 
> +static int
> +ocelot_netdevice_prechangeupper(struct net_device *dev,
> +                               struct net_device *brport_dev,
> +                               struct netdev_notifier_changeupper_info *info)
> +{
> +       struct netlink_ext_ack *extack;
> +       int err = 0;
> +
> +       extack = netdev_notifier_info_to_extack(&info->info);
> +
> +       if (netif_is_bridge_master(info->upper_dev) && !info->linking)
> +               err = ocelot_netdevice_pre_bridge_leave(dev, brport_dev,
> +                                                       extack);
> +
> +       if (netif_is_lag_master(info->upper_dev) && !info->linking)
> +               err = ocelot_netdevice_pre_lag_leave(dev, info->upper_dev,
> +                                                    extack);
> +
> +       return notifier_from_errno(err);
> +}
> +
> +static int
> +ocelot_netdevice_lag_prechangeupper(struct net_device *dev,
> +                                   struct netdev_notifier_changeupper_info *info)
> +{
> +       struct net_device *lower;
> +       struct list_head *iter;
> +       int err = NOTIFY_DONE;
> +
> +       netdev_for_each_lower_dev(dev, lower, iter) {
> +               struct ocelot_port_private *priv = netdev_priv(lower);
> +               struct ocelot_port *ocelot_port = &priv->port;
> +
> +               if (ocelot_port->bond != dev)
> +                       return NOTIFY_OK;
> +
> +               err = ocelot_netdevice_prechangeupper(dev, lower, info);
> +               if (err)
> +                       return notifier_from_errno(err);
> +       }
> +
> +       return NOTIFY_DONE;
> +}
> +
>  static int
>  ocelot_netdevice_changelowerstate(struct net_device *dev,
>                                   struct netdev_lag_lower_state_info *info)
> @@ -1382,6 +1452,17 @@ static int ocelot_netdevice_event(struct notifier_block *unused,
>         struct net_device *dev = netdev_notifier_info_to_dev(ptr);
> 
>         switch (event) {
> +       case NETDEV_PRECHANGEUPPER: {
> +               struct netdev_notifier_changeupper_info *info = ptr;
> +
> +               if (ocelot_netdevice_dev_check(dev))
> +                       return ocelot_netdevice_prechangeupper(dev, dev, info);
> +
> +               if (netif_is_lag_master(dev))
> +                       return ocelot_netdevice_lag_prechangeupper(dev, info);
> +
> +               break;
> +       }
>         case NETDEV_CHANGEUPPER: {
>                 struct netdev_notifier_changeupper_info *info = ptr;
> 
> diff --git a/drivers/net/ethernet/rocker/rocker.h b/drivers/net/ethernet/rocker/rocker.h
> index 315a6e5c0f59..d31cee1cdda9 100644
> --- a/drivers/net/ethernet/rocker/rocker.h
> +++ b/drivers/net/ethernet/rocker/rocker.h
> @@ -119,7 +119,11 @@ struct rocker_world_ops {
>         int (*port_obj_fdb_del)(struct rocker_port *rocker_port,
>                                 u16 vid, const unsigned char *addr);
>         int (*port_master_linked)(struct rocker_port *rocker_port,
> -                                 struct net_device *master);
> +                                 struct net_device *master,
> +                                 struct netlink_ext_ack *extack);
> +       int (*port_master_pre_unlink)(struct rocker_port *rocker_port,
> +                                     struct net_device *master,
> +                                     struct netlink_ext_ack *extack);
>         int (*port_master_unlinked)(struct rocker_port *rocker_port,
>                                     struct net_device *master);
>         int (*port_neigh_update)(struct rocker_port *rocker_port,
> diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c
> index a46633606cae..2e3e413406ac 100644
> --- a/drivers/net/ethernet/rocker/rocker_main.c
> +++ b/drivers/net/ethernet/rocker/rocker_main.c
> @@ -1670,13 +1670,25 @@ rocker_world_port_fdb_del(struct rocker_port *rocker_port,
>  }
> 
>  static int rocker_world_port_master_linked(struct rocker_port *rocker_port,
> -                                          struct net_device *master)
> +                                          struct net_device *master,
> +                                          struct netlink_ext_ack *extack)
>  {
>         struct rocker_world_ops *wops = rocker_port->rocker->wops;
> 
>         if (!wops->port_master_linked)
>                 return -EOPNOTSUPP;
> -       return wops->port_master_linked(rocker_port, master);
> +       return wops->port_master_linked(rocker_port, master, extack);
> +}
> +
> +static int rocker_world_port_master_pre_unlink(struct rocker_port *rocker_port,
> +                                              struct net_device *master,
> +                                              struct netlink_ext_ack *extack)
> +{
> +       struct rocker_world_ops *wops = rocker_port->rocker->wops;
> +
> +       if (!wops->port_master_pre_unlink)
> +               return -EOPNOTSUPP;
> +       return wops->port_master_pre_unlink(rocker_port, master, extack);
>  }
> 
>  static int rocker_world_port_master_unlinked(struct rocker_port *rocker_port,
> @@ -3107,6 +3119,7 @@ struct rocker_port *rocker_port_dev_lower_find(struct net_device *dev,
>  static int rocker_netdevice_event(struct notifier_block *unused,
>                                   unsigned long event, void *ptr)
>  {
> +       struct netlink_ext_ack *extack = netdev_notifier_info_to_extack(ptr);
>         struct net_device *dev = netdev_notifier_info_to_dev(ptr);
>         struct netdev_notifier_changeupper_info *info;
>         struct rocker_port *rocker_port;
> @@ -3116,6 +3129,16 @@ static int rocker_netdevice_event(struct notifier_block *unused,
>                 return NOTIFY_DONE;
> 
>         switch (event) {
> +       case NETDEV_PRECHANGEUPPER:
> +               info = ptr;
> +               if (!info->master)
> +                       goto out;
> +               rocker_port = netdev_priv(dev);
> +               if (!info->linking)
> +                       err = rocker_world_port_master_pre_unlink(rocker_port,
> +                                                                 info->upper_dev,
> +                                                                 extack);
> +               break;
>         case NETDEV_CHANGEUPPER:
>                 info = ptr;
>                 if (!info->master)
> @@ -3123,7 +3146,8 @@ static int rocker_netdevice_event(struct notifier_block *unused,
>                 rocker_port = netdev_priv(dev);
>                 if (info->linking) {
>                         err = rocker_world_port_master_linked(rocker_port,
> -                                                             info->upper_dev);
> +                                                             info->upper_dev,
> +                                                             extack);
>                         if (err)
>                                 netdev_warn(dev, "failed to reflect master linked (err %d)\n",
>                                             err);
> diff --git a/drivers/net/ethernet/rocker/rocker_ofdpa.c b/drivers/net/ethernet/rocker/rocker_ofdpa.c
> index 967a634ee9ac..c32d076bcbf6 100644
> --- a/drivers/net/ethernet/rocker/rocker_ofdpa.c
> +++ b/drivers/net/ethernet/rocker/rocker_ofdpa.c
> @@ -2571,8 +2571,10 @@ static int ofdpa_port_obj_fdb_del(struct rocker_port *rocker_port,
>  }
> 
>  static int ofdpa_port_bridge_join(struct ofdpa_port *ofdpa_port,
> -                                 struct net_device *bridge)
> +                                 struct net_device *bridge,
> +                                 struct netlink_ext_ack *extack)
>  {
> +       struct net_device *dev = ofdpa_port->dev;
>         int err;
> 
>         /* Port is joining bridge, so the internal VLAN for the
> @@ -2592,7 +2594,19 @@ static int ofdpa_port_bridge_join(struct ofdpa_port *ofdpa_port,
> 
>         ofdpa_port->bridge_dev = bridge;
> 
> -       return ofdpa_port_vlan_add(ofdpa_port, OFDPA_UNTAGGED_VID, 0);
> +       err = ofdpa_port_vlan_add(ofdpa_port, OFDPA_UNTAGGED_VID, 0);
> +       if (err)
> +               return err;
> +
> +       return switchdev_bridge_port_offload(dev, dev, extack);
> +}
> +
> +static int ofdpa_port_pre_bridge_leave(struct ofdpa_port *ofdpa_port,
> +                                      struct netlink_ext_ack *extack)
> +{
> +       struct net_device *dev = ofdpa_port->dev;
> +
> +       return switchdev_bridge_port_unoffload(dev, dev, extack);
>  }
> 
>  static int ofdpa_port_bridge_leave(struct ofdpa_port *ofdpa_port)
> @@ -2637,18 +2651,31 @@ static int ofdpa_port_ovs_changed(struct ofdpa_port *ofdpa_port,
>  }
> 
>  static int ofdpa_port_master_linked(struct rocker_port *rocker_port,
> -                                   struct net_device *master)
> +                                   struct net_device *master,
> +                                   struct netlink_ext_ack *extack)
>  {
>         struct ofdpa_port *ofdpa_port = rocker_port->wpriv;
>         int err = 0;
> 
>         if (netif_is_bridge_master(master))
> -               err = ofdpa_port_bridge_join(ofdpa_port, master);
> +               err = ofdpa_port_bridge_join(ofdpa_port, master, extack);
>         else if (netif_is_ovs_master(master))
>                 err = ofdpa_port_ovs_changed(ofdpa_port, master);
>         return err;
>  }
> 
> +static int ofdpa_port_master_pre_unlink(struct rocker_port *rocker_port,
> +                                       struct net_device *master,
> +                                       struct netlink_ext_ack *extack)
> +{
> +       struct ofdpa_port *ofdpa_port = rocker_port->wpriv;
> +       int err = 0;
> +
> +       if (netif_is_bridge_master(master))
> +               err = ofdpa_port_pre_bridge_leave(ofdpa_port, extack);
> +       return err;
> +}
> +
>  static int ofdpa_port_master_unlinked(struct rocker_port *rocker_port,
>                                       struct net_device *master)
>  {
> @@ -2800,6 +2827,7 @@ struct rocker_world_ops rocker_ofdpa_ops = {
>         .port_obj_fdb_add = ofdpa_port_obj_fdb_add,
>         .port_obj_fdb_del = ofdpa_port_obj_fdb_del,
>         .port_master_linked = ofdpa_port_master_linked,
> +       .port_master_pre_unlink = ofdpa_port_master_pre_unlink,
>         .port_master_unlinked = ofdpa_port_master_unlinked,
>         .port_neigh_update = ofdpa_port_neigh_update,
>         .port_neigh_destroy = ofdpa_port_neigh_destroy,
> diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
> index 718539cdd2f2..30e8b21dc6db 100644
> --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
> +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
> @@ -7,6 +7,7 @@
> 
>  #include <linux/clk.h>
>  #include <linux/etherdevice.h>
> +#include <linux/if_bridge.h>
>  #include <linux/if_vlan.h>
>  #include <linux/interrupt.h>
>  #include <linux/kernel.h>
> @@ -2077,10 +2078,13 @@ bool am65_cpsw_port_dev_check(const struct net_device *ndev)
>         return false;
>  }
> 
> -static int am65_cpsw_netdevice_port_link(struct net_device *ndev, struct net_device *br_ndev)
> +static int am65_cpsw_netdevice_port_link(struct net_device *ndev,
> +                                        struct net_device *br_ndev,
> +                                        struct netlink_ext_ack *extack)
>  {
>         struct am65_cpsw_common *common = am65_ndev_to_common(ndev);
>         struct am65_cpsw_ndev_priv *priv = am65_ndev_to_priv(ndev);
> +       int err;
> 
>         if (!common->br_members) {
>                 common->hw_bridge_dev = br_ndev;
> @@ -2092,6 +2096,10 @@ static int am65_cpsw_netdevice_port_link(struct net_device *ndev, struct net_dev
>                         return -EOPNOTSUPP;
>         }
> 
> +       err = switchdev_bridge_port_offload(ndev, ndev, extack);
> +       if (err)
> +               return err;
> +
>         common->br_members |= BIT(priv->port->port_id);
> 
>         am65_cpsw_port_offload_fwd_mark_update(common);
> @@ -2099,6 +2107,12 @@ static int am65_cpsw_netdevice_port_link(struct net_device *ndev, struct net_dev
>         return NOTIFY_DONE;
>  }
> 
> +static int am65_cpsw_netdevice_port_pre_unlink(struct net_device *ndev,
> +                                              struct netlink_ext_ack *extack)
> +{
> +       return switchdev_bridge_port_unoffload(ndev, ndev, extack);
> +}
> +
>  static void am65_cpsw_netdevice_port_unlink(struct net_device *ndev)
>  {
>         struct am65_cpsw_common *common = am65_ndev_to_common(ndev);
> @@ -2116,6 +2130,7 @@ static void am65_cpsw_netdevice_port_unlink(struct net_device *ndev)
>  static int am65_cpsw_netdevice_event(struct notifier_block *unused,
>                                      unsigned long event, void *ptr)
>  {
> +       struct netlink_ext_ack *extack = netdev_notifier_info_to_extack(ptr);
>         struct net_device *ndev = netdev_notifier_info_to_dev(ptr);
>         struct netdev_notifier_changeupper_info *info;
>         int ret = NOTIFY_DONE;
> @@ -2124,12 +2139,21 @@ static int am65_cpsw_netdevice_event(struct notifier_block *unused,
>                 return NOTIFY_DONE;
> 
>         switch (event) {
> +       case NETDEV_PRECHANGEUPPER:
> +               info = ptr;
> +
> +               if (netif_is_bridge_master(info->upper_dev) && !info->linking)
> +                       ret = am65_cpsw_netdevice_port_pre_unlink(ndev, extack);
> +
> +               break;
>         case NETDEV_CHANGEUPPER:
>                 info = ptr;
> 
>                 if (netif_is_bridge_master(info->upper_dev)) {
>                         if (info->linking)
> -                               ret = am65_cpsw_netdevice_port_link(ndev, info->upper_dev);
> +                               ret = am65_cpsw_netdevice_port_link(ndev,
> +                                                                   info->upper_dev,
> +                                                                   extack);
>                         else
>                                 am65_cpsw_netdevice_port_unlink(ndev);
>                 }
> diff --git a/drivers/net/ethernet/ti/cpsw_new.c b/drivers/net/ethernet/ti/cpsw_new.c
> index 57d279fdcc9f..8c586d1ff7d7 100644
> --- a/drivers/net/ethernet/ti/cpsw_new.c
> +++ b/drivers/net/ethernet/ti/cpsw_new.c
> @@ -11,6 +11,7 @@
>  #include <linux/module.h>
>  #include <linux/irqreturn.h>
>  #include <linux/interrupt.h>
> +#include <linux/if_bridge.h>
>  #include <linux/if_ether.h>
>  #include <linux/etherdevice.h>
>  #include <linux/net_tstamp.h>
> @@ -1499,10 +1500,12 @@ static void cpsw_port_offload_fwd_mark_update(struct cpsw_common *cpsw)
>  }
> 
>  static int cpsw_netdevice_port_link(struct net_device *ndev,
> -                                   struct net_device *br_ndev)
> +                                   struct net_device *br_ndev,
> +                                   struct netlink_ext_ack *extack)
>  {
>         struct cpsw_priv *priv = netdev_priv(ndev);
>         struct cpsw_common *cpsw = priv->cpsw;
> +       int err;
> 
>         if (!cpsw->br_members) {
>                 cpsw->hw_bridge_dev = br_ndev;
> @@ -1514,6 +1517,10 @@ static int cpsw_netdevice_port_link(struct net_device *ndev,
>                         return -EOPNOTSUPP;
>         }
> 
> +       err = switchdev_bridge_port_offload(ndev, ndev, extack);
> +       if (err)
> +               return err;
> +
>         cpsw->br_members |= BIT(priv->emac_port);
> 
>         cpsw_port_offload_fwd_mark_update(cpsw);
> @@ -1521,6 +1528,12 @@ static int cpsw_netdevice_port_link(struct net_device *ndev,
>         return NOTIFY_DONE;
>  }
> 
> +static int cpsw_netdevice_port_pre_unlink(struct net_device *ndev,
> +                                         struct netlink_ext_ack *extack)
> +{
> +       return switchdev_bridge_port_unoffload(ndev, ndev, extack);
> +}
> +
>  static void cpsw_netdevice_port_unlink(struct net_device *ndev)
>  {
>         struct cpsw_priv *priv = netdev_priv(ndev);
> @@ -1538,6 +1551,7 @@ static void cpsw_netdevice_port_unlink(struct net_device *ndev)
>  static int cpsw_netdevice_event(struct notifier_block *unused,
>                                 unsigned long event, void *ptr)
>  {
> +       struct netlink_ext_ack *extack = netdev_notifier_info_to_extack(ptr);
>         struct net_device *ndev = netdev_notifier_info_to_dev(ptr);
>         struct netdev_notifier_changeupper_info *info;
>         int ret = NOTIFY_DONE;
> @@ -1546,13 +1560,21 @@ static int cpsw_netdevice_event(struct notifier_block *unused,
>                 return NOTIFY_DONE;
> 
>         switch (event) {
> +       case NETDEV_PRECHANGEUPPER:
> +               info = ptr;
> +
> +               if (netif_is_bridge_master(info->upper_dev) && !info->linking)
> +                       ret = cpsw_netdevice_port_pre_unlink(ndev, extack);
> +
> +               break;
>         case NETDEV_CHANGEUPPER:
>                 info = ptr;
> 
>                 if (netif_is_bridge_master(info->upper_dev)) {
>                         if (info->linking)
>                                 ret = cpsw_netdevice_port_link(ndev,
> -                                                              info->upper_dev);
> +                                                              info->upper_dev,
> +                                                              extack);
>                         else
>                                 cpsw_netdevice_port_unlink(ndev);
>                 }
> diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
> index b651c5e32a28..d0bec83488b9 100644
> --- a/include/linux/if_bridge.h
> +++ b/include/linux/if_bridge.h
> @@ -206,4 +206,30 @@ static inline int br_fdb_replay(const struct net_device *br_dev,
>  }
>  #endif
> 
> +#if IS_ENABLED(CONFIG_BRIDGE) && IS_ENABLED(CONFIG_NET_SWITCHDEV)
> +
> +int switchdev_bridge_port_offload(struct net_device *brport_dev,
> +                                 struct net_device *dev,
> +                                 struct netlink_ext_ack *extack);
> +int switchdev_bridge_port_unoffload(struct net_device *brport_dev,
> +                                   struct net_device *dev,
> +                                   struct netlink_ext_ack *extack);
> +
> +#else
> +
> +static inline int switchdev_bridge_port_offload(struct net_device *brport_dev,
> +                                               struct net_device *dev,
> +                                               struct netlink_ext_ack *extack)
> +{
> +       return -EINVAL;
> +}
> +
> +static inline int switchdev_bridge_port_unoffload(struct net_device *brport_dev,
> +                                                 struct net_device *dev,
> +                                                 struct netlink_ext_ack *extack)
> +{
> +       return -EINVAL;
> +}
> +#endif
> +
>  #endif
> diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
> index c0df50e4abbb..86f6d7e93ea8 100644
> --- a/net/bridge/br_if.c
> +++ b/net/bridge/br_if.c
> @@ -349,7 +349,6 @@ static void del_nbp(struct net_bridge_port *p)
>         nbp_backup_clear(p);
> 
>         nbp_update_port_count(br);
> -       nbp_switchdev_del(p);
> 
>         netdev_upper_dev_unlink(dev, br->dev);
> 
> @@ -644,10 +643,6 @@ int br_add_if(struct net_bridge *br, struct net_device *dev,
>         if (err)
>                 goto err5;
> 
> -       err = nbp_switchdev_add(p);
> -       if (err)
> -               goto err6;
> -
>         dev_disable_lro(dev);
> 
>         list_add_rcu(&p->list, &br->port_list);
> @@ -685,13 +680,13 @@ int br_add_if(struct net_bridge *br, struct net_device *dev,
>                  */
>                 err = dev_pre_changeaddr_notify(br->dev, dev->dev_addr, extack);
>                 if (err)
> -                       goto err7;
> +                       goto err6;
>         }
> 
>         err = nbp_vlan_init(p, extack);
>         if (err) {
>                 netdev_err(dev, "failed to initialize vlan filtering on this port\n");
> -               goto err7;
> +               goto err6;
>         }
> 
>         spin_lock_bh(&br->lock);
> @@ -714,14 +709,12 @@ int br_add_if(struct net_bridge *br, struct net_device *dev,
> 
>         return 0;
> 
> -err7:
> +err6:
>         if (fdb_synced)
>                 br_fdb_unsync_static(br, p);
>         list_del_rcu(&p->list);
>         br_fdb_delete_by_port(br, p, 0, 1);
>         nbp_update_port_count(br);
> -       nbp_switchdev_del(p);
> -err6:
>         netdev_upper_dev_unlink(dev, br->dev);
>  err5:
>         dev->priv_flags &= ~IFF_BRIDGE_PORT;
> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> index a23c565b8970..46236302eed5 100644
> --- a/net/bridge/br_private.h
> +++ b/net/bridge/br_private.h
> @@ -335,6 +335,8 @@ struct net_bridge_port {
>          * hardware domain.
>          */
>         int                             hwdom;
> +       int                             offload_count;
> +       struct netdev_phys_item_id      ppid;
>  #endif
>         u16                             group_fwd_mask;
>         u16                             backup_redirected_cnt;
> @@ -1673,8 +1675,6 @@ void br_switchdev_fdb_notify(struct net_bridge *br,
>  int br_switchdev_port_vlan_add(struct net_device *dev, u16 vid, u16 flags,
>                                struct netlink_ext_ack *extack);
>  int br_switchdev_port_vlan_del(struct net_device *dev, u16 vid);
> -int nbp_switchdev_add(struct net_bridge_port *p);
> -void nbp_switchdev_del(struct net_bridge_port *p);
>  void br_switchdev_init(struct net_bridge *br);
> 
>  static inline void br_switchdev_frame_unmark(struct sk_buff *skb)
> @@ -1723,15 +1723,6 @@ static inline void br_switchdev_frame_unmark(struct sk_buff *skb)
>  {
>  }
> 
> -static inline int nbp_switchdev_add(struct net_bridge_port *p)
> -{
> -       return 0;
> -}
> -
> -static inline void nbp_switchdev_del(struct net_bridge_port *p)
> -{
> -}
> -
>  static inline void br_switchdev_init(struct net_bridge *br)
>  {
>  }
> diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
> index f3120f13c293..e335cbcc8ce5 100644
> --- a/net/bridge/br_switchdev.c
> +++ b/net/bridge/br_switchdev.c
> @@ -162,27 +162,101 @@ static void nbp_switchdev_hwdom_put(struct net_bridge_port *leaving)
>         clear_bit(leaving->hwdom, &br->busy_hwdoms);
>  }
> 
> -int nbp_switchdev_add(struct net_bridge_port *p)
> +static int nbp_switchdev_add(struct net_bridge_port *p,
> +                            struct netdev_phys_item_id ppid,
> +                            struct netlink_ext_ack *extack)
>  {
> -       struct netdev_phys_item_id ppid = { };
> -       int err;
> +       if (p->offload_count) {
> +               /* Prevent unsupported configurations such as a bridge port
> +                * which is a bonding interface, and the member ports are from
> +                * different hardware switches.
> +                */
> +               if (!netdev_phys_item_id_same(&p->ppid, &ppid)) {
> +                       NL_SET_ERR_MSG_MOD(extack,
> +                                          "Same bridge port cannot be offloaded by two physical switches");
> +                       return -EBUSY;
> +               }
> 
> -       ASSERT_RTNL();
> +               /* Tolerate drivers that call switchdev_bridge_port_offload()
> +                * more than once for the same bridge port, such as when the
> +                * bridge port is an offloaded bonding/team interface.
> +                */
> +               p->offload_count++;
> 
> -       err = dev_get_port_parent_id(p->dev, &ppid, true);
> -       if (err) {
> -               if (err == -EOPNOTSUPP)
> -                       return 0;
> -               return err;
> +               return 0;
>         }
> 
> +       p->ppid = ppid;
> +       p->offload_count = 1;
> +
>         return nbp_switchdev_hwdom_set(p);
>  }
> 
> -void nbp_switchdev_del(struct net_bridge_port *p)
> +static void nbp_switchdev_del(struct net_bridge_port *p,
> +                             struct netdev_phys_item_id ppid)
>  {
>         ASSERT_RTNL();
> 
> +       if (WARN_ON(!netdev_phys_item_id_same(&p->ppid, &ppid)))
> +               return;
> +
> +       if (WARN_ON(!p->offload_count))
> +               return;
> +
> +       p->offload_count--;
> +
> +       if (p->offload_count)
> +               return;
> +
>         if (p->hwdom)
>                 nbp_switchdev_hwdom_put(p);
>  }
> +
> +/* Let the bridge know that this port is offloaded, so that it can assign a
> + * switchdev hardware domain to it.
> + */
> +int switchdev_bridge_port_offload(struct net_device *brport_dev,
> +                                 struct net_device *dev,
> +                                 struct netlink_ext_ack *extack)
> +{
> +       struct netdev_phys_item_id ppid;
> +       struct net_bridge_port *p;
> +       int err;
> +
> +       ASSERT_RTNL();
> +
> +       p = br_port_get_rtnl(brport_dev);
> +       if (!p)
> +               return -ENODEV;
> +
> +       err = dev_get_port_parent_id(dev, &ppid, false);
> +       if (err)
> +               return err;
> +
> +       return nbp_switchdev_add(p, ppid, extack);
> +}
> +EXPORT_SYMBOL_GPL(switchdev_bridge_port_offload);
> +
> +int switchdev_bridge_port_unoffload(struct net_device *brport_dev,
> +                                   struct net_device *dev,
> +                                   struct netlink_ext_ack *extack)
> +{
> +       struct netdev_phys_item_id ppid;
> +       struct net_bridge_port *p;
> +       int err;
> +
> +       ASSERT_RTNL();
> +
> +       p = br_port_get_rtnl(dev);
> +       if (!p)
> +               return -ENODEV;
> +
> +       err = dev_get_port_parent_id(dev, &ppid, false);
> +       if (err)
> +               return err;
> +
> +       nbp_switchdev_del(p, ppid);
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL_GPL(switchdev_bridge_port_unoffload);
> diff --git a/net/dsa/port.c b/net/dsa/port.c
> index 28b45b7e66df..b824b6f8aa84 100644
> --- a/net/dsa/port.c
> +++ b/net/dsa/port.c
> @@ -292,6 +292,8 @@ int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br,
>                 .port = dp->index,
>                 .br = br,
>         };
> +       struct net_device *dev = dp->slave;
> +       struct net_device *brport_dev;
>         int err;
> 
>         /* Here the interface is already bridged. Reflect the current
> @@ -299,16 +301,24 @@ int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br,
>          */
>         dp->bridge_dev = br;
> 
> +       brport_dev = dsa_port_to_bridge_port(dp);
> +
>         err = dsa_broadcast(DSA_NOTIFIER_BRIDGE_JOIN, &info);
>         if (err)
>                 goto out_rollback;
> 
> -       err = dsa_port_switchdev_sync(dp, extack);
> +       err = switchdev_bridge_port_offload(brport_dev, dev, extack);
>         if (err)
>                 goto out_rollback_unbridge;
> 
> +       err = dsa_port_switchdev_sync(dp, extack);
> +       if (err)
> +               goto out_rollback_unoffload;
> +
>         return 0;
> 
> +out_rollback_unoffload:
> +       switchdev_bridge_port_unoffload(brport_dev, dev, extack);
>  out_rollback_unbridge:
>         dsa_broadcast(DSA_NOTIFIER_BRIDGE_LEAVE, &info);
>  out_rollback:
> @@ -319,6 +329,14 @@ int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br,
>  int dsa_port_pre_bridge_leave(struct dsa_port *dp, struct net_device *br,
>                               struct netlink_ext_ack *extack)
>  {
> +       struct net_device *brport_dev = dsa_port_to_bridge_port(dp);
> +       struct net_device *dev = dp->slave;
> +       int err;
> +
> +       err = switchdev_bridge_port_unoffload(brport_dev, dev, extack);
> +       if (err)
> +               return err;
> +
>         return dsa_port_switchdev_unsync_objs(dp, br, extack);
>  }
> 
> --
> 2.25.1
> 
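
For readers following the offload_count handling in the quoted
nbp_switchdev_add()/nbp_switchdev_del(), the reference counting can be
modelled in userspace roughly as below. This is only a sketch, not the
kernel code: struct model_ppid, struct model_port, model_offload() and
model_unoffload() are hypothetical stand-ins, the hwdom field merely
stands in for nbp_switchdev_hwdom_set()/nbp_switchdev_hwdom_put(), and
returning -EINVAL where the patch uses WARN_ON() is an assumption made
so that the behaviour can be checked from a test.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for struct netdev_phys_item_id. */
struct model_ppid {
	unsigned char id[32];
	unsigned char id_len;
};

/* Hypothetical stand-in for the fields the patch adds to the bridge port. */
struct model_port {
	int offload_count;
	struct model_ppid ppid;
	int hwdom;	/* 1 while a hardware domain is assigned (simplified) */
};

int model_ppid_same(const struct model_ppid *a, const struct model_ppid *b)
{
	return a->id_len == b->id_len &&
	       memcmp(a->id, b->id, a->id_len) == 0;
}

/* Mirrors the flow of the quoted nbp_switchdev_add(). */
int model_offload(struct model_port *p, struct model_ppid ppid)
{
	if (p->offload_count) {
		/* A second offloader must belong to the same switch. */
		if (!model_ppid_same(&p->ppid, &ppid))
			return -EBUSY;

		/* Repeated calls for the same bridge port are tolerated. */
		p->offload_count++;
		return 0;
	}

	p->ppid = ppid;
	p->offload_count = 1;
	p->hwdom = 1;	/* assumption: hardware domain assignment succeeds */
	return 0;
}

/* Mirrors the flow of the quoted nbp_switchdev_del(); the error returns
 * replace the WARN_ON() checks (assumption of this sketch).
 */
int model_unoffload(struct model_port *p, struct model_ppid ppid)
{
	if (!model_ppid_same(&p->ppid, &ppid))
		return -EINVAL;
	if (!p->offload_count)
		return -EINVAL;

	if (--p->offload_count)
		return 0;

	p->hwdom = 0;	/* last user gone, release the hardware domain */
	return 0;
}
```

In this model, two ports of the same switch under one bond each call
model_offload() once, and the hardware domain is released only after
both have called model_unoffload().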

Tested-by: Horatiu Vultur <horatiu.vultur@...rochip.com> # ocelot-switch

-- 
/Horatiu
