lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d4b590df-bc66-870f-2327-d207a3ee7134@mellanox.com>
Date:   Thu, 9 Jul 2020 18:34:25 +0000
From:   Parav Pandit <parav@...lanox.com>
To:     Niklas Schnelle <schnelle@...ux.ibm.com>,
        Saeed Mahameed <saeedm@...lanox.com>,
        Shay Drory <shayd@...lanox.com>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "ubraun@...ux.ibm.com" <ubraun@...ux.ibm.com>,
        "kgraul@...ux.ibm.com" <kgraul@...ux.ibm.com>,
        "raspl@...ibm.com" <raspl@...ibm.com>
Subject: Re: [REGRESSION] mlx5: Driver remove during hot unplug is broken

On 7/9/2020 3:36 PM, Niklas Schnelle wrote:
> 
> On 7/8/20 5:44 PM, Parav Pandit wrote:
> ... snip ..
>>>
>>
>> It is likely because events_cleanup() freed the memory using kvfree() that health recovery context is trying to access in notifier chain.
>>
>> While reviewing I see few more errors as below.
>> (a) mlx5_pci_err_detected() invokes mlx5_drain_health_wq() outside of intf_state_mutex.
>> (b) mlx5_enter_error_state() in commit b6e0b6bebe0 read and updates dev state outside of intf_state_mutex.
>> (c) due to drain_health_wq() introduction in mlx5_pci_close()  in commit 42ea9f1b5c625 health_wq is flushed twice.
>> (d) priv->events freed is kvfree() but allocated using kzalloc().
>>
>> This code certainly needs rework.
>>
>> In meantime to avoid this regression, I believe below hunk eliminates error introduced in patch 41798df9bfc.
>> Will you please help test it?
>>
>> Shay and I did the review of below patch.
>> If it works I will get it through Saeed's tree and internal reviews.
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>> index ebec2318dbc4..529df5703737 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>> @@ -785,11 +785,6 @@ static int mlx5_pci_init(struct mlx5_core_dev *dev, struct pci_dev *pdev,
>>
>>  static void mlx5_pci_close(struct mlx5_core_dev *dev)
>>  {
>> -       /* health work might still be active, and it needs pci bar in
>> -        * order to know the NIC state. Therefore, drain the health WQ
>> -        * before removing the pci bars
>> -        */
>> -       mlx5_drain_health_wq(dev);
>>         iounmap(dev->iseg);
>>         pci_clear_master(dev->pdev);
>>         release_bar(dev->pdev);
>> @@ -1235,6 +1230,7 @@ void mlx5_unload_one(struct mlx5_core_dev *dev, bool cleanup)
>>         if (cleanup) {
>>                 mlx5_unregister_device(dev);
>>                 mlx5_devlink_unregister(priv_to_devlink(dev));
>> +               mlx5_drain_health_wq(dev);
> I think with the above you can remove the mlx5_drain_health_wq(dev) in remove_one()
> as that calls mlx5_unload_one() with cleanup == true.
Oh yes, I was supposed to remove from there too.
That was the whole point of removing from there. But I missed. :(
My bad that I didn't give share patch.
Please find the inline patch at the end of the email thread.

> Also I wonder if it is a problem
> that the order of mlx5_devlink_unregister(priv_to_devlink(dev)) and mlx5_unregister_device(dev)
> is switched compared to the 5.7 code. That said changing both seems to result in a deadlock
> though not the "leak of a command resource":
> 
Current order in 5.8-rc4 of unregister_device and devlink_unregister is
the right one. That is, to unregister the netdevice before unregistering
devlink device.

Please do not change  the order of mlx5_unregister_device() and
mlx5_devlink_unregister().


>>
> As is the patch above fixes the dereference but results in the same completion error
> as current 5.8-rc4

Below patch should hopefully fix it. It fixes the bug introduced in
commit 41798df9bfca.
Additionally it fixes one small change of 42ea9f1b5c62.
Please remove all previous changes and apply below changes.
Hopefully this should resolve.


From a260b2e6a6065a57c2fa621271483cd51d0a1abf Mon Sep 17 00:00:00 2001
From: Parav Pandit <parav@...lanox.com>
Date: Thu, 9 Jul 2020 20:57:13 +0300
Subject: [PATCH] net/mlx5: Drain health wq after unregistering device

When health worker is disabled before unregistering the devices, and if
the firmware error is triggered while in middle of unregstering the
device, it leads to very long command timeout.

This error occurs because unregistering device involves several firmware
commands, and health recovery worker is disabled. So device state is
never updated as error.

    cpu-0                                  cpu-1
    -----                                  -----
remove_one()
  mlx5_drain_health_wq()                 poll_health()
  mlx5_unregister_device()                mlx5_trigger_health_work()
     mlx5_cmd_exec()                        /* health work skipped */
      /* timeout */

Hence, disable the health worker after unregistering the device and
before fully unloading the device, as health worker uses the device
during health recovery. Both such contexts are serialized by
intf_state_mutex.

Change-Id: I054df865726af42ea940300b439889e58d0956d1
Fixes: 41798df9bfca ("net/mlx5: Drain wq first during PCI device removal")
Fixes: 42ea9f1b5c62 ("net/mlx5: drain health workqueue in case of driver
load error")
Reported-by: Niklas Schnelle <schnelle@...ux.ibm.com>
Signed-off-by: Shay Drory <shayd@...lanox.com>
Signed-off-by: Parav Pandit <parav@...lanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/main.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c
b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index e32d46c33701..100a5c31ee0b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -785,11 +785,6 @@ static int mlx5_pci_init(struct mlx5_core_dev *dev,
struct pci_dev *pdev,

 static void mlx5_pci_close(struct mlx5_core_dev *dev)
 {
-	/* health work might still be active, and it needs pci bar in
-	 * order to know the NIC state. Therefore, drain the health WQ
-	 * before removing the pci bars
-	 */
-	mlx5_drain_health_wq(dev);
 	iounmap(dev->iseg);
 	pci_clear_master(dev->pdev);
 	release_bar(dev->pdev);
@@ -1230,6 +1225,7 @@ void mlx5_unload_one(struct mlx5_core_dev *dev,
bool cleanup)
 	if (cleanup) {
 		mlx5_unregister_device(dev);
 		mlx5_devlink_unregister(priv_to_devlink(dev));
+		mlx5_drain_health_wq(dev);
 	} else {
 		mlx5_detach_device(dev);
 	}
@@ -1361,6 +1357,11 @@ static int init_one(struct pci_dev *pdev, const
struct pci_device_id *id)
 	return 0;

 err_load_one:
+	/* health work might still be active, and it needs pci bar in
+	 * order to know the NIC state. Therefore, drain the health WQ
+	 * before removing the pci bars
+	 */
+	mlx5_drain_health_wq(dev);
 	mlx5_pci_close(dev);
 pci_init_err:
 	mlx5_mdev_uninit(dev);
@@ -1377,7 +1378,6 @@ static void remove_one(struct pci_dev *pdev)

 	devlink_reload_disable(devlink);
 	mlx5_crdump_disable(dev);
-	mlx5_drain_health_wq(dev);
 	mlx5_unload_one(dev, true);
 	mlx5_pci_close(dev);
 	mlx5_mdev_uninit(dev);
-- 
2.26.2

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ