Message-ID: <1758531671-819655-3-git-send-email-tariqt@nvidia.com>
Date: Mon, 22 Sep 2025 12:01:06 +0300
From: Tariq Toukan <tariqt@...dia.com>
To: Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Andrew Lunn <andrew+netdev@...n.ch>, "David
S. Miller" <davem@...emloft.net>
CC: Saeed Mahameed <saeedm@...dia.com>, Leon Romanovsky <leon@...nel.org>,
Tariq Toukan <tariqt@...dia.com>, Mark Bloch <mbloch@...dia.com>,
<netdev@...r.kernel.org>, <linux-rdma@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, Gal Pressman <gal@...dia.com>, Moshe Shemesh
<moshe@...dia.com>, Jianbo Liu <jianbol@...dia.com>
Subject: [PATCH net-next 2/7] net/mlx5e: Prevent entering switchdev mode with inconsistent netns
From: Jianbo Liu <jianbol@...dia.com>

When a PF enters switchdev mode, its netdevice becomes the uplink
representor but remains in its current network namespace. All other
representors (VFs, SFs) are created in the netns of the devlink
instance.

If the PF's netns has been moved and differs from the devlink's netns,
enabling switchdev mode would create a state where the OVS control
plane (ovs-vsctl) cannot manage the switch because the PF uplink
representor and the other representors are split across different
namespaces.

To prevent this inconsistent configuration, block the request to enter
switchdev mode if the PF netdevice's netns does not match the netns of
its devlink instance.

As part of this change, the PF's netns is first marked as immutable.
This prevents race conditions where the netns could be changed after
the check is performed but before the mode transition is complete, and
it aligns the PF's behavior with that of the final uplink representor.

Signed-off-by: Jianbo Liu <jianbol@...dia.com>
Reviewed-by: Cosmin Ratiu <cratiu@...dia.com>
Reviewed-by: Jiri Pirko <jiri@...dia.com>
Reviewed-by: Dragos Tatulea <dtatulea@...dia.com>
Signed-off-by: Tariq Toukan <tariqt@...dia.com>
---
.../mellanox/mlx5/core/eswitch_offloads.c | 33 +++++++++++++++++++
1 file changed, 33 insertions(+)
Previously submitted to net:
https://lore.kernel.org/all/1757939074-617281-3-git-send-email-tariqt@nvidia.com/
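
For readers who want to see the control flow in isolation, the
mark-immutable, compare, and roll-back-on-error sequence described in the
commit message can be modeled in userspace roughly as below. This is only
a sketch under stated assumptions: every model_* name is hypothetical,
namespaces are reduced to integer ids, and the rtnl locking and uplink
netdev reference counting done by the real patch are left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not kernel types. */
struct model_netdev {
	int netns_id;		/* models dev_net(netdev) */
	bool netns_immutable;	/* models netdev->netns_immutable */
};

struct model_devlink {
	int netns_id;			/* models devlink_net(devlink) */
	struct model_netdev *uplink;	/* NULL when no uplink netdev exists */
};

/*
 * Set the immutable flag first, then compare namespaces. Returns false
 * only when an uplink netdev exists and its netns differs from the
 * devlink's netns. The real patch does both steps under rtnl_lock().
 */
static bool model_netns_immutable_set(struct model_devlink *dl, bool immutable)
{
	struct model_netdev *nd = dl->uplink;

	if (!nd)
		return true;

	nd->netns_immutable = immutable;
	return nd->netns_id == dl->netns_id;
}

/* Models only the switchdev branch of the mode-set path. */
static int model_enter_switchdev(struct model_devlink *dl)
{
	int err = 0;

	if (!model_netns_immutable_set(dl, true))
		err = -EINVAL;

	/* The actual mode transition would run here and may also fail. */

	/* On any error, drop the immutable flag again. */
	if (err)
		model_netns_immutable_set(dl, false);

	return err;
}
```

With a model uplink in netns 1 and a model devlink in netns 2,
model_enter_switchdev() returns -EINVAL and leaves netns_immutable
cleared. With matching ids it returns 0 and the flag stays set.
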
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index bc9838dc5bf8..ff6e0130de38 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -3772,6 +3772,29 @@ void mlx5_eswitch_unblock_mode(struct mlx5_core_dev *dev)
 	up_write(&esw->mode_lock);
 }
 
+/* Returns false only when uplink netdev exists and its netns is different from
+ * devlink's netns. True for all others so entering switchdev mode is allowed.
+ */
+static bool mlx5_devlink_netdev_netns_immutable_set(struct devlink *devlink,
+						    bool immutable)
+{
+	struct mlx5_core_dev *mdev = devlink_priv(devlink);
+	struct net_device *netdev;
+	bool ret;
+
+	netdev = mlx5_uplink_netdev_get(mdev);
+	if (!netdev)
+		return true;
+
+	rtnl_lock();
+	netdev->netns_immutable = immutable;
+	ret = net_eq(dev_net(netdev), devlink_net(devlink));
+	rtnl_unlock();
+
+	mlx5_uplink_netdev_put(mdev, netdev);
+	return ret;
+}
+
 int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode,
 				  struct netlink_ext_ack *extack)
 {
@@ -3814,6 +3837,14 @@ int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode,
 	esw->eswitch_operation_in_progress = true;
 	up_write(&esw->mode_lock);
 
+	if (mode == DEVLINK_ESWITCH_MODE_SWITCHDEV &&
+	    !mlx5_devlink_netdev_netns_immutable_set(devlink, true)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Can't change E-Switch mode to switchdev when netdev net namespace has diverged from the devlink's.");
+		err = -EINVAL;
+		goto skip;
+	}
+
 	if (mode == DEVLINK_ESWITCH_MODE_LEGACY)
 		esw->dev->priv.flags |= MLX5_PRIV_FLAGS_SWITCH_LEGACY;
 	mlx5_eswitch_disable_locked(esw);
@@ -3832,6 +3863,8 @@ int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode,
 	}
 
 skip:
+	if (mode == DEVLINK_ESWITCH_MODE_SWITCHDEV && err)
+		mlx5_devlink_netdev_netns_immutable_set(devlink, false);
 	down_write(&esw->mode_lock);
 	esw->eswitch_operation_in_progress = false;
 unlock:
--
2.31.1