Message-ID: <20260120081654.1639138-3-tariqt@nvidia.com>
Date: Tue, 20 Jan 2026 10:16:52 +0200
From: Tariq Toukan <tariqt@...dia.com>
To: Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Andrew Lunn <andrew+netdev@...n.ch>, "David
S. Miller" <davem@...emloft.net>
CC: Saeed Mahameed <saeedm@...dia.com>, Leon Romanovsky <leon@...nel.org>,
Tariq Toukan <tariqt@...dia.com>, Mark Bloch <mbloch@...dia.com>,
<netdev@...r.kernel.org>, <linux-rdma@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, Gal Pressman <gal@...dia.com>, Moshe Shemesh
<moshe@...dia.com>, Cosmin Ratiu <cratiu@...dia.com>, Dragos Tatulea
<dtatulea@...dia.com>
Subject: [PATCH net 2/4] net/mlx5: Fix deadlock between devlink lock and esw->wq
From: Cosmin Ratiu <cratiu@...dia.com>

esw_functions_changed_event_handler -> esw_vfs_changed_event_handler is
called from the esw->work_queue and acquires the devlink lock.

Changing the esw mode is done via .eswitch_mode_set (acquires devlink
lock in the devlink_nl_pre_doit call) -> mlx5_devlink_eswitch_mode_set
-> mlx5_eswitch_disable_locked -> mlx5_eswitch_event_handler_unregister
-> flush_workqueue.

This creates a circular lock dependency which could lead to a real
deadlock, as the code flushing the workqueue is holding the devlink
lock, and the work handler being flushed could try to acquire it.

Fix that by adding a new bool field 'notifier_enabled' next to the event
handler scheduling the work, keeping it true while the notifier is
active, and having the work handler retry devl_trylock for as long as
the field is true, with a short delay between attempts to avoid busy
looping.

This avoids the deadlock because the event handler is removed first
(setting 'notifier_enabled' to false) and only then is the workqueue
flushed, so the work handler eventually gives up trying to acquire the
lock, as the work is no longer necessary.
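
The retry-and-bail pattern described above can be illustrated with a
minimal userspace sketch. This is not the driver code: it assumes POSIX
threads, uses a pthread mutex as a stand-in for the devlink lock and a
thread join as a stand-in for flush_workqueue, and the names big_lock,
work_ran, work_handler and demo_unregister_while_work_pending are
hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Userspace stand-ins (hypothetical names, not kernel APIs). */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER; /* plays the devlink lock */
static atomic_bool notifier_enabled; /* plays esw_funcs.notifier_enabled */
static atomic_bool work_ran;         /* set only if the handler got the lock */

/* Plays the work handler: retry trylock while the work is still relevant. */
static void *work_handler(void *arg)
{
	const struct timespec delay = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };

	(void)arg;
	while (pthread_mutex_trylock(&big_lock) != 0) {
		if (!atomic_load(&notifier_enabled))
			return NULL;       /* teardown in progress, give up */
		nanosleep(&delay, NULL);   /* short delay, avoid busy looping */
	}
	atomic_store(&work_ran, true);     /* the real work would go here */
	pthread_mutex_unlock(&big_lock);
	return NULL;
}

/*
 * Plays the mode-change path: take the lock, clear the flag, then wait
 * for the handler while still holding the lock. Returns true if the
 * wait completed and the handler abandoned its work.
 */
static bool demo_unregister_while_work_pending(void)
{
	pthread_t worker;

	atomic_store(&notifier_enabled, true);
	atomic_store(&work_ran, false);

	pthread_mutex_lock(&big_lock);
	if (pthread_create(&worker, NULL, work_handler, NULL) != 0) {
		pthread_mutex_unlock(&big_lock);
		return false;
	}
	atomic_store(&notifier_enabled, false); /* "unregister the notifier" */
	pthread_join(worker, NULL);             /* "flush" with the lock held */
	pthread_mutex_unlock(&big_lock);

	return !atomic_load(&work_ran);
}
```

With a blocking pthread_mutex_lock in work_handler, the join above would
never return, which mirrors the reported deadlock. With the trylock loop
the handler observes the cleared flag and exits, so the join completes.
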
Fixes: f1bc646c9a06 ("net/mlx5: Use devl_ API in mlx5_esw_offloads_devlink_port_register")
Signed-off-by: Cosmin Ratiu <cratiu@...dia.com>
Reviewed-by: Moshe Shemesh <moshe@...dia.com>
Reviewed-by: Dragos Tatulea <dtatulea@...dia.com>
Signed-off-by: Tariq Toukan <tariqt@...dia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 6 +++++-
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 1 +
.../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 12 +++++++++++-
3 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 4b7a1ce7f406..fddc3b33222d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -1066,14 +1066,18 @@ static void mlx5_eswitch_event_handler_register(struct mlx5_eswitch *esw)
 	if (esw->mode == MLX5_ESWITCH_OFFLOADS && mlx5_eswitch_is_funcs_handler(esw->dev)) {
 		MLX5_NB_INIT(&esw->esw_funcs.nb, mlx5_esw_funcs_changed_handler,
 			     ESW_FUNCTIONS_CHANGED);
+		esw->esw_funcs.notifier_enabled = true;
 		mlx5_eq_notifier_register(esw->dev, &esw->esw_funcs.nb);
 	}
 }
 
 static void mlx5_eswitch_event_handler_unregister(struct mlx5_eswitch *esw)
 {
-	if (esw->mode == MLX5_ESWITCH_OFFLOADS && mlx5_eswitch_is_funcs_handler(esw->dev))
+	if (esw->mode == MLX5_ESWITCH_OFFLOADS &&
+	    mlx5_eswitch_is_funcs_handler(esw->dev)) {
+		esw->esw_funcs.notifier_enabled = false;
 		mlx5_eq_notifier_unregister(esw->dev, &esw->esw_funcs.nb);
+	}
 
 	flush_workqueue(esw->work_queue);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index ad1073f7b79f..e20574a197e4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -338,6 +338,7 @@ struct mlx5_host_work {
 
 struct mlx5_esw_functions {
 	struct mlx5_nb		nb;
+	bool			notifier_enabled;
 	bool			host_funcs_disabled;
 	u16			num_vfs;
 	u16			num_ec_vfs;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index ea94a727633f..0199bea2cb31 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -3597,7 +3597,17 @@ esw_vfs_changed_event_handler(struct mlx5_eswitch *esw, const u32 *out)
 		return;
 
 	devlink = priv_to_devlink(esw->dev);
-	devl_lock(devlink);
+	/* Repeatedly try to grab the lock with a delay while this work is
+	 * still relevant.
+	 * This allows a concurrent mlx5_eswitch_event_handler_unregister
+	 * (holding the devlink lock) to flush the wq without deadlocking.
+	 */
+	while (!devl_trylock(devlink)) {
+		if (!esw->esw_funcs.notifier_enabled)
+			return;
+		schedule_timeout_interruptible(msecs_to_jiffies(10));
+	}
+
 	/* Number of VFs can only change from "0 to x" or "x to 0". */
 	if (esw->esw_funcs.num_vfs > 0) {
 		mlx5_eswitch_unload_vf_vports(esw, esw->esw_funcs.num_vfs);
--
2.44.0