lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon,  9 Dec 2019 17:32:49 +0100
From:   Marc Kleine-Budde <mkl@...gutronix.de>
To:     netdev@...r.kernel.org
Cc:     davem@...emloft.net, linux-can@...r.kernel.org,
        kernel@...gutronix.de, Sean Nyekjaer <sean@...nix.com>,
        Joakim Zhang <qiangqing.zhang@....com>,
        linux-stable <stable@...r.kernel.org>,
        Marc Kleine-Budde <mkl@...gutronix.de>
Subject: [PATCH 06/13] can: flexcan: fix possible deadlock and out-of-order reception after wakeup

From: Sean Nyekjaer <sean@...nix.com>

When suspending, and there is still CAN traffic on the interfaces the
flexcan immediately wakes the platform again. As it should :-). But it
throws this error msg:

[ 3169.378661] PM: noirq suspend of devices failed

On the way down to suspend the interface that throws the error message
calls flexcan_suspend() but fails to call flexcan_noirq_suspend(). That
means flexcan_enter_stop_mode() is called, but on the way out of suspend
the driver only calls flexcan_resume() and skips flexcan_noirq_resume(),
thus it doesn't call flexcan_exit_stop_mode(). This leaves the flexcan
in stop mode, and with the current driver it can't recover from this
even with a soft reboot, it requires a hard reboot.

This patch fixes the deadlock when using self wakeup, by calling
flexcan_exit_stop_mode() from flexcan_resume() instead of
flexcan_noirq_resume().

This also fixes another issue: CAN frames are received out-of-order in
first IRQ handler run after wakeup.

The problem is that the wakeup latency from frame reception to the IRQ
handler (where the CAN frames are sorted by timestamp) is much bigger
than the time stamp counter wrap around time. This means it's
impossible to sort the CAN frames by timestamp.

The reason is that the controller exits stop mode during noirq resume,
which means it receives frames immediately, but interrupt handling is
still not possible.

So exit stop mode during resume stage instead of noirq resume fixes this
issue.

Fixes: de3578c198c6 ("can: flexcan: add self wakeup support")
Signed-off-by: Sean Nyekjaer <sean@...nix.com>
Tested-by: Sean Nyekjaer <sean@...nix.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@....com>
Cc: linux-stable <stable@...r.kernel.org> # >= v5.0
Signed-off-by: Marc Kleine-Budde <mkl@...gutronix.de>
---
 drivers/net/can/flexcan.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
index a929cdda9ab2..b6f675a5e2d9 100644
--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -1722,6 +1722,9 @@ static int __maybe_unused flexcan_resume(struct device *device)
 		netif_start_queue(dev);
 		if (device_may_wakeup(device)) {
 			disable_irq_wake(dev->irq);
+			err = flexcan_exit_stop_mode(priv);
+			if (err)
+				return err;
 		} else {
 			err = pm_runtime_force_resume(device);
 			if (err)
@@ -1767,14 +1770,9 @@ static int __maybe_unused flexcan_noirq_resume(struct device *device)
 {
 	struct net_device *dev = dev_get_drvdata(device);
 	struct flexcan_priv *priv = netdev_priv(dev);
-	int err;
 
-	if (netif_running(dev) && device_may_wakeup(device)) {
+	if (netif_running(dev) && device_may_wakeup(device))
 		flexcan_enable_wakeup_irq(priv, false);
-		err = flexcan_exit_stop_mode(priv);
-		if (err)
-			return err;
-	}
 
 	return 0;
 }
-- 
2.24.0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ