lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 4 Dec 2019 11:36:06 +0000
From:   Joakim Zhang <qiangqing.zhang@....com>
To:     "mkl@...gutronix.de" <mkl@...gutronix.de>,
        "sean@...nix.com" <sean@...nix.com>,
        "linux-can@...r.kernel.org" <linux-can@...r.kernel.org>
CC:     dl-linux-imx <linux-imx@....com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Joakim Zhang <qiangqing.zhang@....com>
Subject: [PATCH V3 1/6] can: flexcan: fix deadlock when using self wakeup

From: Sean Nyekjaer <sean@...nix.com>

When suspending, when there is still can traffic on the interfaces the
flexcan immediately wakes the platform again. As it should :-). But it
throws this error msg:
[ 3169.378661] PM: noirq suspend of devices failed

On the way down to suspend the interface that throws the error message does
call flexcan_suspend but fails to call flexcan_noirq_suspend. That means
flexcan_enter_stop_mode is called, but on the way out of suspend the driver
only calls flexcan_resume and skips flexcan_noirq_resume, thus it doesn't
call flexcan_exit_stop_mode. This leaves the flexcan in stop mode, and with
the current driver it can't recover from this even with a soft reboot, it
requires a hard reboot.

This patch can fix deadlock when using self wakeup, it happenes to be
able to fix another issue that frames out-of-order in first IRQ handler
run after wakeup.

In wakeup case, after system resume, frames received out-of-order in
first IRQ handler, the problem is wakeup latency from frame reception to
IRQ handler is much bigger than the counter overflow. This means it's
impossible to sort the CAN frames by timestamp. The reason is that controller
exits stop mode during noirq resume, then it can receive the frame immediately.
If noirq reusme stage consumes much time, it will extend interrupt response
time. So exit stop mode during resume stage instead of noirq resume can
fix this issue.

Fixes: de3578c198c6 ("can: flexcan: add self wakeup support")
Signed-off-by: Sean Nyekjaer <sean@...nix.com>
Tested-by: Sean Nyekjaer <sean@...nix.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@....com>
------
ChangeLog:
	V1->V2: *no change.

	V2->V3: *split wakeup interrupt ack into another patch.
---
 drivers/net/can/flexcan.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
index 2efa06119f68..99aae90c1cdd 100644
--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -1722,6 +1722,9 @@ static int __maybe_unused flexcan_resume(struct device *device)
 		netif_start_queue(dev);
 		if (device_may_wakeup(device)) {
 			disable_irq_wake(dev->irq);
+			err = flexcan_exit_stop_mode(priv);
+			if (err)
+				return err;
 		} else {
 			err = pm_runtime_force_resume(device);
 			if (err)
@@ -1767,14 +1770,9 @@ static int __maybe_unused flexcan_noirq_resume(struct device *device)
 {
 	struct net_device *dev = dev_get_drvdata(device);
 	struct flexcan_priv *priv = netdev_priv(dev);
-	int err;
 
-	if (netif_running(dev) && device_may_wakeup(device)) {
+	if (netif_running(dev) && device_may_wakeup(device))
 		flexcan_enable_wakeup_irq(priv, false);
-		err = flexcan_exit_stop_mode(priv);
-		if (err)
-			return err;
-	}
 
 	return 0;
 }
-- 
2.17.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ