Message-ID: <20251128103240.1723386-1-mukesh.ojha@oss.qualcomm.com>
Date: Fri, 28 Nov 2025 16:02:40 +0530
From: Mukesh Ojha <mukesh.ojha@....qualcomm.com>
To: Bjorn Andersson <andersson@...nel.org>,
        Mathieu Poirier <mathieu.poirier@...aro.org>
Cc: linux-arm-msm@...r.kernel.org, linux-remoteproc@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Mukesh Ojha <mukesh.ojha@....qualcomm.com>
Subject: [PATCH v4] remoteproc: qcom: Fix NULL pointer issue

A fatal error interrupt can trigger rproc crash handling while a
user-space recovery is initiated in parallel. The overlapping
recovery/stop sequences race on rproc state and subdevice teardown,
resulting in a NULL pointer dereference in the GLINK SMEM unregister
path.

	Process-A                			Process-B

  fatal error interrupt happens

  rproc_crash_handler_work()
    mutex_lock_interruptible(&rproc->lock);
    ...

       rproc->state = RPROC_CRASHED;
    ...
    mutex_unlock(&rproc->lock);

    rproc_trigger_recovery()
     mutex_lock_interruptible(&rproc->lock);

      qcom_pas_stop()
      qcom_q6v5_pas 20c00000.remoteproc: failed to shutdown: -22
      remoteproc remoteproc3: can't stop rproc: -22
     mutex_unlock(&rproc->lock);

						echo enabled > /sys/class/remoteproc/remoteprocX/recovery
						recovery_store()
						 rproc_trigger_recovery()
						  mutex_lock_interruptible(&rproc->lock);
						   rproc_stop()
						    glink_subdev_stop()
						      qcom_glink_smem_unregister() ==|
                                                                                     |
                                                                                     V
						      Unable to handle kernel NULL pointer dereference
                                                                at virtual address 0000000000000358

It is tempting to introduce a remoteproc state that could be set from
the ->ops->stop() callback, which would avoid the second attempt and
prevent the crash. However, making remoteproc recovery dependent on
manual intervention or a system reboot is not ideal. We should always
try to recover the remote processor if possible. A failure in the
->ops->stop() callback might be temporary or caused by a timeout, and a
later recovery attempt could still succeed, as seen in similar
scenarios. Therefore, instead of adding a restrictive state, add NULL
checks in the subdevice stop paths to avoid a kernel crash and allow
the system to move forward gracefully.

Signed-off-by: Mukesh Ojha <mukesh.ojha@....qualcomm.com>
---
Changes in v4: https://lore.kernel.org/all/20241016045546.2613436-1-quic_mojha@quicinc.com/
 - Brought the same change from v2.
 - Added smd->edge NULL check.
 - Rephrased the commit text.

Changes in v3:
 - Fix kernel test reported error.

Changes in v2: https://lore.kernel.org/lkml/20240925103351.1628788-1-quic_mojha@quicinc.com/
 - Removed the NULL pointer check; instead added a new state to signify
   a non-recoverable state of the remoteproc.
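
For reviewers, the guarded-stop pattern this patch applies can be
sketched in user space as below. This is only an illustration under
stated assumptions: fake_edge, fake_glink_subdev, fake_unregister() and
fake_subdev_stop() are hypothetical stand-ins, not the kernel types or
functions, and the sketch does not model locking or rproc state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the edge handle; not a kernel type. */
struct fake_edge {
	int registered;
};

/* Hypothetical stand-in for struct qcom_rproc_glink. */
struct fake_glink_subdev {
	struct fake_edge *edge;
};

/* Counts how often the unregister helper actually ran. */
static int unregister_calls;

/*
 * Stand-in for the unregister helper. Like the real path in the
 * report, it dereferences its argument and would fault on NULL.
 */
static void fake_unregister(struct fake_edge *edge)
{
	edge->registered = 0;
	unregister_calls++;
}

/*
 * Guarded stop: a second call after a previous stop finds edge
 * already NULL and returns early instead of dereferencing it.
 */
static void fake_subdev_stop(struct fake_glink_subdev *glink)
{
	if (!glink->edge)
		return;

	fake_unregister(glink->edge);
	glink->edge = NULL;
}
```

Calling fake_subdev_stop() twice on the same object mirrors the two
recovery attempts in the trace above: the first call unregisters and
clears the pointer, the second returns without touching it.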

 drivers/remoteproc/qcom_common.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/remoteproc/qcom_common.c b/drivers/remoteproc/qcom_common.c
index 8c8688f99f0a..6480293d2f61 100644
--- a/drivers/remoteproc/qcom_common.c
+++ b/drivers/remoteproc/qcom_common.c
@@ -209,6 +209,9 @@ static void glink_subdev_stop(struct rproc_subdev *subdev, bool crashed)
 {
 	struct qcom_rproc_glink *glink = to_glink_subdev(subdev);
 
+	if (!glink->edge)
+		return;
+
 	qcom_glink_smem_unregister(glink->edge);
 	glink->edge = NULL;
 }
@@ -320,6 +323,9 @@ static void smd_subdev_stop(struct rproc_subdev *subdev, bool crashed)
 {
 	struct qcom_rproc_subdev *smd = to_smd_subdev(subdev);
 
+	if (!smd->edge)
+		return;
+
 	qcom_smd_unregister_edge(smd->edge);
 	smd->edge = NULL;
 }
-- 
2.50.1

