Date:   Wed, 19 May 2021 18:44:18 -0500
From:   Alex Elder <elder@...aro.org>
To:     ohad@...ery.com, bjorn.andersson@...aro.org,
        mathieu.poirier@...aro.org
Cc:     linux-remoteproc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [PATCH 1/1] remoteproc: use freezable workqueue for crash notifications

When a remoteproc has crashed, rproc_report_crash() is called to
handle whatever recovery is desired.  This can happen at almost any
time, often triggered by an interrupt, though it can also be
initiated by a write to the debugfs file remoteproc/remoteproc*/crash.

When a crash is reported, the crash handler worker is scheduled to
run (rproc_crash_handler_work()).  One thing that worker does is
call rproc_trigger_recovery(), which calls rproc_stop().  That calls
the ->stop method for any remoteproc subdevices before making the
remote processor go offline.

The Q6V5 modem remoteproc driver implements an SSR subdevice that
notifies registered drivers when the modem changes operational state
(prepare, started, stop/crash, unprepared).  The IPA driver
registers to receive these notifications.

With that as context, I'll now describe the problem.

There was a situation in which buggy modem firmware led to a modem
crash very soon after system (AP) resume had begun.  The crash caused
a remoteproc SSR crash notification to be sent to the IPA driver.
The problem was that, although system resume had begun, it had not
yet completed, and the IPA driver was still in a suspended state.

This scenario could happen to any driver that registers for these
SSR notifications, because they are delivered without knowledge of
the (suspend) state of registered recipient drivers.

This patch offers a simple fix: run the crash handling worker
function on the system freezable workqueue.
This workqueue does not operate if user space is frozen (for
suspend).  As a result, the SSR subdevice only delivers its
crash notification when the system is fully operational (i.e.,
neither suspended nor in suspend/resume transition).

Signed-off-by: Alex Elder <elder@...aro.org>
---
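The deferral described in the commit message can be illustrated with a
small user-space model. This is only a sketch under stated assumptions:
every model_* name below is hypothetical and is not a kernel API, and
work is run synchronously here, whereas the kernel runs queued work
asynchronously in worker threads. It shows only the ordering that
matters: work queued while the queue is frozen does not run until the
queue is thawed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define MODEL_MAX_PENDING 8

typedef void (*model_work_fn)(void *data);

struct model_work {
	model_work_fn fn;
	void *data;
};

struct model_freezable_wq {
	bool frozen;		/* true while "suspended" */
	size_t npending;	/* number of queued, not yet run items */
	struct model_work pending[MODEL_MAX_PENDING];
};

/* Run every pending item in FIFO order, then empty the queue. */
static void model_wq_drain(struct model_freezable_wq *wq)
{
	size_t i;

	for (i = 0; i < wq->npending; i++)
		wq->pending[i].fn(wq->pending[i].data);
	wq->npending = 0;
}

/* Queue an item; it runs right away unless the queue is frozen. */
static bool model_queue_work(struct model_freezable_wq *wq,
			     model_work_fn fn, void *data)
{
	if (wq->npending == MODEL_MAX_PENDING)
		return false;

	wq->pending[wq->npending].fn = fn;
	wq->pending[wq->npending].data = data;
	wq->npending++;

	if (!wq->frozen)
		model_wq_drain(wq);

	return true;
}

static void model_wq_freeze(struct model_freezable_wq *wq)
{
	wq->frozen = true;
}

/* Thawing runs everything that was deferred while frozen. */
static void model_wq_thaw(struct model_freezable_wq *wq)
{
	wq->frozen = false;
	model_wq_drain(wq);
}

/* Stand-in for the crash handler: counts how often it has run. */
static void model_crash_handler(void *data)
{
	int *handled = data;

	(*handled)++;
}

/* A crash reported while frozen is handled only after thaw. */
static int model_demo(void)
{
	struct model_freezable_wq wq = { 0 };
	int handled = 0;

	model_wq_freeze(&wq);
	model_queue_work(&wq, model_crash_handler, &handled);
	assert(handled == 0);	/* deferred while frozen */

	model_wq_thaw(&wq);
	return handled;		/* handler has now run once */
}
```

In the model, model_demo() queues the handler while frozen, checks that
it has not run, and returns the handler count after thawing. In this
sketch the frozen flag plays the role of user space being frozen for
suspend.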
 drivers/remoteproc/remoteproc_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index 39cf44cb08035..6bedf2d2af239 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -2724,8 +2724,8 @@ void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type)
 	dev_err(&rproc->dev, "crash detected in %s: type %s\n",
 		rproc->name, rproc_crash_to_string(type));
 
-	/* create a new task to handle the error */
-	schedule_work(&rproc->crash_handler);
+	/* Have a worker handle the error; ensure system is not suspended */
+	queue_work(system_freezable_wq, &rproc->crash_handler);
 }
 EXPORT_SYMBOL(rproc_report_crash);
 
-- 
2.27.0
