[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241220081505.1286093-5-tariqt@nvidia.com>
Date: Fri, 20 Dec 2024 10:15:05 +0200
From: Tariq Toukan <tariqt@...dia.com>
To: "David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Eric Dumazet <edumazet@...gle.com>, "Andrew
Lunn" <andrew+netdev@...n.ch>
CC: <netdev@...r.kernel.org>, Saeed Mahameed <saeedm@...dia.com>, Gal Pressman
<gal@...dia.com>, Leon Romanovsky <leonro@...dia.com>, Mark Bloch
<mbloch@...dia.com>, Jianbo Liu <jianbol@...dia.com>, Tariq Toukan
<tariqt@...dia.com>
Subject: [PATCH net 4/4] net/mlx5e: Keep netdev when leave switchdev for devlink set legacy only
From: Jianbo Liu <jianbol@...dia.com>
In the cited commit, when changing from switchdev to legacy mode,
uplink representor's netdev is kept, and its profile is replaced with
nic profile, so netdev is detached from old profile, then attach to
new profile.
During profile change, the hardware resources allocated by the old
profile will be cleaned up. However, the cleanup is relying on the
related kernel modules. And they may need to flush themselves first,
which is triggered by netdev events, for example, NETDEV_UNREGISTER.
However, netdev is kept, or netdev_register is called after the
cleanup, which may cause troubles because the resources are still
referred by kernel modules.
The same process applies to all the caes when uplink is leaving
switchdev mode, including devlink eswitch mode set legacy, driver
unload and devlink reload. For the first one, it can be blocked and
returns failure to users, whenever possible. But it's hard for the
others. Besides, the attachment to nic profile is unnecessary as the
netdev will be unregistered anyway for such cases.
So in this patch, the original behavior is kept only for devlink
eswitch set mode legacy. For the others, moves netdev unregistration
before the profile change.
Fixes: 7a9fb35e8c3a ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode")
Signed-off-by: Jianbo Liu <jianbol@...dia.com>
Signed-off-by: Tariq Toukan <tariqt@...dia.com>
---
.../net/ethernet/mellanox/mlx5/core/en_main.c | 19 +++++++++++++++++--
.../net/ethernet/mellanox/mlx5/core/en_rep.c | 15 +++++++++++++++
.../mellanox/mlx5/core/eswitch_offloads.c | 2 ++
include/linux/mlx5/driver.h | 1 +
4 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index dd16d73000c3..0ec17c276bdd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -6542,8 +6542,23 @@ static void _mlx5e_remove(struct auxiliary_device *adev)
mlx5_core_uplink_netdev_set(mdev, NULL);
mlx5e_dcbnl_delete_app(priv);
- unregister_netdev(priv->netdev);
- _mlx5e_suspend(adev, false);
+ /* When unload driver, the netdev is in registered state
+ * if it's from legacy mode. If from switchdev mode, it
+ * is already unregistered before changing to NIC profile.
+ */
+ if (priv->netdev->reg_state == NETREG_REGISTERED) {
+ unregister_netdev(priv->netdev);
+ _mlx5e_suspend(adev, false);
+ } else {
+ struct mlx5_core_dev *pos;
+ int i;
+
+ if (test_bit(MLX5E_STATE_DESTROYING, &priv->state))
+ mlx5_sd_for_each_dev(i, mdev, pos)
+ mlx5e_destroy_mdev_resources(pos);
+ else
+ _mlx5e_suspend(adev, true);
+ }
/* Avoid cleanup if profile rollback failed. */
if (priv->profile)
priv->profile->cleanup(priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 554f9cb5b53f..fdff9fd8a89e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -1509,6 +1509,21 @@ mlx5e_vport_uplink_rep_unload(struct mlx5e_rep_priv *rpriv)
priv = netdev_priv(netdev);
+ /* This bit is set when using devlink to change eswitch mode from
+ * switchdev to legacy. As need to keep uplink netdev ifindex, we
+ * detach uplink representor profile and attach NIC profile only.
+ * The netdev will be unregistered later when unload NIC auxiliary
+ * driver for this case.
+ * We explicitly block devlink eswitch mode change if any IPSec rules
+ * offloaded, but can't block other cases, such as driver unload
+ * and devlink reload. We have to unregister netdev before profile
+ * change for those cases. This is to avoid resource leak because
+ * the offloaded rules don't have the chance to be unoffloaded before
+ * cleanup which is triggered by detach uplink representor profile.
+ */
+ if (!(priv->mdev->priv.flags & MLX5_PRIV_FLAGS_SWITCH_LEGACY))
+ unregister_netdev(netdev);
+
mlx5e_netdev_attach_nic_profile(priv);
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 40359f320724..06076dd9ec64 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -3777,6 +3777,8 @@ int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode,
esw->eswitch_operation_in_progress = true;
up_write(&esw->mode_lock);
+ if (mode == DEVLINK_ESWITCH_MODE_LEGACY)
+ esw->dev->priv.flags |= MLX5_PRIV_FLAGS_SWITCH_LEGACY;
mlx5_eswitch_disable_locked(esw);
if (mode == DEVLINK_ESWITCH_MODE_SWITCHDEV) {
if (mlx5_devlink_trap_get_num_active(esw->dev)) {
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index fc7e6153b73d..8f5991168ccd 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -524,6 +524,7 @@ enum {
* creation/deletion on drivers rescan. Unset during device attach.
*/
MLX5_PRIV_FLAGS_DETACH = 1 << 2,
+ MLX5_PRIV_FLAGS_SWITCH_LEGACY = 1 << 3,
};
struct mlx5_adev {
--
2.45.0
Powered by blists - more mailing lists