lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <lsq.1487198494.778808225@decadent.org.uk>
Date:   Wed, 15 Feb 2017 22:41:34 +0000
From:   Ben Hutchings <ben@...adent.org.uk>
To:     linux-kernel@...r.kernel.org, stable@...r.kernel.org
CC:     akpm@...ux-foundation.org, "Matan Barak" <matanb@...lanox.com>,
        "Tariq Toukan" <tariqt@...lanox.com>,
        "Jack Morgenstein" <jackm@....mellanox.co.il>,
        "David S. Miller" <davem@...emloft.net>
Subject: [PATCH 3.2 026/126] net/mlx4_core: Fix deadlock when switching
 between polling and event fw commands

3.2.85-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Jack Morgenstein <jackm@....mellanox.co.il>

commit a7e1f04905e5b2b90251974dddde781301b6be37 upstream.

When switching from polling-based fw commands to event-based fw
commands, there is a race condition which could cause a fw command
in another task to hang: that task will keep waiting for the polling
sempahore, but may never be able to acquire it. This is due to
mlx4_cmd_use_events, which "down"s the sempahore back to 0.

During driver initialization, this is not a problem, since no other
tasks which invoke FW commands are active.

However, there is a problem if the driver switches to polling mode
and then back to event mode during normal operation.

The "test_interrupts" feature does exactly that.
Running "ethtool -t <eth device> offline" causes the PF driver to
temporarily switch to polling mode, and then back to event mode.
(Note that for VF drivers, such switching is not performed).

Fix this by adding a read-write semaphore for protection when
switching between modes.

Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Signed-off-by: Jack Morgenstein <jackm@....mellanox.co.il>
Signed-off-by: Matan Barak <matanb@...lanox.com>
Signed-off-by: Tariq Toukan <tariqt@...lanox.com>
Signed-off-by: David S. Miller <davem@...emloft.net>
[bwh: Backported to 3.2: adjust context, indentation]
Signed-off-by: Ben Hutchings <ben@...adent.org.uk>
---
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -313,12 +313,18 @@ int __mlx4_cmd(struct mlx4_dev *dev, u64
 	       int out_is_imm, u32 in_modifier, u8 op_modifier,
 	       u16 op, unsigned long timeout)
 {
+	int ret;
+
+	down_read(&mlx4_priv(dev)->cmd.switch_sem);
 	if (mlx4_priv(dev)->cmd.use_events)
-		return mlx4_cmd_wait(dev, in_param, out_param, out_is_imm,
-				     in_modifier, op_modifier, op, timeout);
+		ret = mlx4_cmd_wait(dev, in_param, out_param, out_is_imm,
+				    in_modifier, op_modifier, op, timeout);
 	else
-		return mlx4_cmd_poll(dev, in_param, out_param, out_is_imm,
-				     in_modifier, op_modifier, op, timeout);
+		ret = mlx4_cmd_poll(dev, in_param, out_param, out_is_imm,
+				    in_modifier, op_modifier, op, timeout);
+
+	up_read(&mlx4_priv(dev)->cmd.switch_sem);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(__mlx4_cmd);
 
@@ -326,6 +332,7 @@ int mlx4_cmd_init(struct mlx4_dev *dev)
 {
 	struct mlx4_priv *priv = mlx4_priv(dev);
 
+	init_rwsem(&priv->cmd.switch_sem);
 	mutex_init(&priv->cmd.hcr_mutex);
 	sema_init(&priv->cmd.poll_sem, 1);
 	priv->cmd.use_events = 0;
@@ -372,6 +379,7 @@ int mlx4_cmd_use_events(struct mlx4_dev
 	if (!priv->cmd.context)
 		return -ENOMEM;
 
+	down_write(&priv->cmd.switch_sem);
 	for (i = 0; i < priv->cmd.max_cmds; ++i) {
 		priv->cmd.context[i].token = i;
 		priv->cmd.context[i].next  = i + 1;
@@ -390,6 +398,7 @@ int mlx4_cmd_use_events(struct mlx4_dev
 	--priv->cmd.token_mask;
 
 	priv->cmd.use_events = 1;
+	up_write(&priv->cmd.switch_sem);
 
 	down(&priv->cmd.poll_sem);
 
@@ -404,6 +413,7 @@ void mlx4_cmd_use_polling(struct mlx4_de
 	struct mlx4_priv *priv = mlx4_priv(dev);
 	int i;
 
+	down_write(&priv->cmd.switch_sem);
 	priv->cmd.use_events = 0;
 
 	for (i = 0; i < priv->cmd.max_cmds; ++i)
@@ -412,6 +422,7 @@ void mlx4_cmd_use_polling(struct mlx4_de
 	kfree(priv->cmd.context);
 
 	up(&priv->cmd.poll_sem);
+	up_write(&priv->cmd.switch_sem);
 }
 
 struct mlx4_cmd_mailbox *mlx4_alloc_cmd_mailbox(struct mlx4_dev *dev)
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -42,6 +42,7 @@
 #include <linux/timer.h>
 #include <linux/semaphore.h>
 #include <linux/workqueue.h>
+#include <linux/rwsem.h>
 
 #include <linux/mlx4/device.h>
 #include <linux/mlx4/driver.h>
@@ -190,6 +191,7 @@ struct mlx4_cmd {
 	struct mutex		hcr_mutex;
 	struct semaphore	poll_sem;
 	struct semaphore	event_sem;
+	struct rw_semaphore	switch_sem;
 	int			max_cmds;
 	spinlock_t		context_lock;
 	int			free_head;

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ