Message-ID: <2024081353-blah-reversion-1435@gregkh>
Date: Tue, 13 Aug 2024 11:37:14 +0200
From: Greg KH <gregkh@...uxfoundation.org>
To: Abhishek Singh <quic_abhishes@...cinc.com>
Cc: srinivas.kandagatla@...aro.org, linux-arm-msm@...r.kernel.org,
	quic_bkumar@...cinc.com, linux-kernel@...r.kernel.org,
	quic_ktadakam@...cinc.com, quic_chennak@...cinc.com,
	dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH v1] misc: fastrpc: Trigger a panic using BUG_ON in device
 release

On Mon, Aug 05, 2024 at 04:36:28PM +0530, Abhishek Singh wrote:
> 
> On 7/30/2024 12:46 PM, Greg KH wrote:
> > On Tue, Jul 30, 2024 at 12:39:45PM +0530, Abhishek Singh wrote:
> >> The user process on ARM closes the device node while closing the
> >> session, triggers a remote call to terminate the PD running on the
> >> DSP. If the DSP is in an unstable state and cannot process the remote
> >> request from the HLOS, glink fails to deliver the kill request to the
> >> DSP, resulting in a timeout error. Currently, this error is ignored,
> >> and the session is closed, causing all the SMMU mappings associated
> >> with that specific PD to be removed. However, since the PD is still
> >> operational on the DSP, any attempt to access these SMMU mappings
> >> results in an SMMU fault, leading to a panic.  As the SMMU mappings
> >> have already been removed, there is no available information on the
> >> DSP to determine the root cause of its unresponsiveness to remote
> >> calls. As the DSP is unresponsive to all process remote calls, use
> >> BUG_ON to prevent the removal of SMMU mappings and to properly
> >> identify the root cause of the DSP’s unresponsiveness to the remote
> >> calls.
> >>
> >> Signed-off-by: Abhishek Singh <quic_abhishes@...cinc.com>
> >> ---
> >>  drivers/misc/fastrpc.c | 4 ++++
> >>  1 file changed, 4 insertions(+)
> >>
> >> diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
> >> index 5204fda51da3..bac9c749564c 100644
> >> --- a/drivers/misc/fastrpc.c
> >> +++ b/drivers/misc/fastrpc.c
> >> @@ -97,6 +97,7 @@
> >>  #define FASTRPC_RMID_INIT_CREATE_STATIC	8
> >>  #define FASTRPC_RMID_INIT_MEM_MAP      10
> >>  #define FASTRPC_RMID_INIT_MEM_UNMAP    11
> >> +#define PROCESS_KILL_SC 0x01010000
> >>  
> >>  /* Protection Domain(PD) ids */
> >>  #define ROOT_PD		(0)
> >> @@ -1128,6 +1129,9 @@ static int fastrpc_invoke_send(struct fastrpc_session_ctx *sctx,
> >>  	fastrpc_context_get(ctx);
> >>  
> >>  	ret = rpmsg_send(cctx->rpdev->ept, (void *)msg, sizeof(*msg));
> >> +	/* trigger panic if glink communication is broken and the message is for PD kill */
> >> +	BUG_ON((ret == -ETIMEDOUT) && (handle == FASTRPC_INIT_HANDLE) &&
> >> +			(ctx->sc == PROCESS_KILL_SC));
> > 
> > You just crashed the machine completely, sorry, but no, properly handle
> > the issue and clean up if you can detect it, do not break systems.
> > 
> But the Glink communication with DSP is already broken; we cannot communicate with the DSP.
> The system will crash if we proceed with cleanup on the ARM side. If we don’t do cleanup,
> a resource leak will occur. Eventually, the system will become dead. That’s why I am
> crashing the device.

Then explicitly call panic() if you think you really want to shut the
system down.

greg k-h
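
For illustration, the suggestion above could be sketched as follows. This is a minimal userspace model, not the driver's actual code: the helper name pd_kill_timed_out() is hypothetical, the value of FASTRPC_INIT_HANDLE is an assumption, and PROCESS_KILL_SC is copied from the quoted patch. It pulls the condition from the proposed BUG_ON() into a predicate, so that the call site can make the shutdown an explicit panic() with a message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the quoted patch. */
#define PROCESS_KILL_SC 0x01010000u
/* Assumed value for illustration; check drivers/misc/fastrpc.c. */
#define FASTRPC_INIT_HANDLE 1u

/*
 * Hypothetical helper: true only when the send timed out AND the
 * message was the PD-kill request on the init handle.
 */
static bool pd_kill_timed_out(int ret, uint32_t handle, uint32_t sc)
{
	return ret == -ETIMEDOUT &&
	       handle == FASTRPC_INIT_HANDLE &&
	       sc == PROCESS_KILL_SC;
}

/*
 * In the driver, the call site after rpmsg_send() might then read
 * roughly like this (sketch only, not a tested kernel change):
 *
 *	if (pd_kill_timed_out(ret, handle, ctx->sc))
 *		panic("fastrpc: PD kill request timed out\n");
 */
```

Unlike BUG_ON(), an explicit panic() states the intent to stop the system and records why in the panic message.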
