Message-ID: <DU0PR04MB94174193978D834B7C934E9F880F9@DU0PR04MB9417.eurprd04.prod.outlook.com>
Date:   Mon, 14 Mar 2022 07:31:44 +0000
From:   Peng Fan <peng.fan@....com>
To:     Bjorn Andersson <bjorn.andersson@...aro.org>,
        "Peng Fan (OSS)" <peng.fan@....nxp.com>
CC:     "mathieu.poirier@...aro.org" <mathieu.poirier@...aro.org>,
        "arnaud.pouliquen@...s.st.com" <arnaud.pouliquen@...s.st.com>,
        "linux-remoteproc@...r.kernel.org" <linux-remoteproc@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH V2 2/2] remoteproc: support attach recovery after rproc
 crash

> Subject: Re: [PATCH V2 2/2] remoteproc: support attach recovery after rproc
> crash
> 
> On Tue 08 Mar 00:48 CST 2022, Peng Fan (OSS) wrote:
> 
> > From: Peng Fan <peng.fan@....com>
> >
> > The current logic only supports the main processor stopping/starting
> > the remote processor after an rproc crash. However, on some SoCs, such
> > as i.MX8QM/QXP, the remote processor could do attach recovery after
> > crash and trigger watchdog
> 
> Does it really do something called "attach recovery and trigger watchdog
> reboot"? Doesn't it just reboot itself and Linux needs to detach and reattach
> to get something (what?) reset?

I mean the remote processor can restart on its own, without Linux loading
firmware or stopping/starting it. The Linux side only needs to detach and
re-attach in order to communicate with the remote processor again.
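
To make the two cases concrete, here is a small user-space sketch of which steps each recovery path performs. It is only an illustration, not kernel code: the MODEL_* names and model_recovery_plan() are made up for this example, and the feature bit is merely named after RPROC_FEAT_ATTACH_RECOVERY from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature bit, named after RPROC_FEAT_ATTACH_RECOVERY in the patch. */
#define MODEL_FEAT_ATTACH_RECOVERY (1u << 0)
#define MODEL_MAX_STEPS 4

/* Hypothetical labels for the steps the main processor performs. */
enum model_step {
	MODEL_DETACH,
	MODEL_ATTACH,
	MODEL_STOP,
	MODEL_COREDUMP,
	MODEL_LOAD_FIRMWARE,
	MODEL_START,
};

/*
 * Fill 'out' with the recovery steps for the given feature mask and
 * return how many steps were written (at most MODEL_MAX_STEPS).
 */
static size_t model_recovery_plan(unsigned int features,
				  enum model_step out[MODEL_MAX_STEPS])
{
	size_t n = 0;

	assert(out != NULL);

	if (features & MODEL_FEAT_ATTACH_RECOVERY) {
		/* Remote core restarted by itself: only re-establish the link. */
		out[n++] = MODEL_DETACH;
		out[n++] = MODEL_ATTACH;
	} else {
		/* Main processor drives the whole recovery, as before. */
		out[n++] = MODEL_STOP;
		out[n++] = MODEL_COREDUMP;
		out[n++] = MODEL_LOAD_FIRMWARE;
		out[n++] = MODEL_START;
	}

	return n;
}
```

With the feature bit set the plan is just detach followed by attach; without it the plan is the existing stop, coredump, load firmware, start sequence.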

> 
> > reboot. It does not need main processor to load image, or stop/start
> > M4 core.
> >
> > Introduce two functions, rproc_attach_recovery and
> > rproc_firmware_recovery, for the two cases. Firmware recovery works as
> > before, with the main processor helping the recovery, while in attach
> > recovery the remote processor recovers by itself without help.
> > For attach recovery, we only do detach and attach.
> >
> > Signed-off-by: Peng Fan <peng.fan@....com>
> > ---
> >
> > V2:
> >  use rproc_has_feature in patch 1/2
> >
> >  drivers/remoteproc/remoteproc_core.c | 67 ++++++++++++++++++++--------
> >  1 file changed, 48 insertions(+), 19 deletions(-)
> >
> > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > index 69f51acf235e..366fad475898 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -1887,6 +1887,50 @@ static int __rproc_detach(struct rproc *rproc)
> >  	return 0;
> >  }
> >
> > +static int rproc_attach_recovery(struct rproc *rproc)
> > +{
> > +	int ret;
> > +
> > +	mutex_unlock(&rproc->lock);
> > +	ret = rproc_detach(rproc);
> > +	mutex_lock(&rproc->lock);
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (atomic_inc_return(&rproc->power) > 1)
> 
> In the stop/coredump/start path the code _will_ attempt to recover the
> remote processor. With rproc_detach() and rproc_attach() fiddling with the
> rproc->power refcount this might do something, or it might not do something.
> And with the mutex_unlock() it's likely that you're opening up for various
> race conditions in between.

rproc_boot() increments rproc->power.
rproc_detach() decrements rproc->power.
rproc_attach() does not touch rproc->power.

For attach recovery the sequence is detach -> attach, so I added one
increment of rproc->power, with a check, to keep the count balanced.
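
To illustrate the counting, here is a minimal user-space model of the sequence. It is a sketch under assumptions, not the kernel implementation: struct model_rproc and all model_* functions are invented for this example, locking is left out entirely, and the boot step is simplified to an attach.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct rproc: only what this sketch needs. */
struct model_rproc {
	atomic_int power;	/* models rproc->power */
	int attached;		/* 1 while Linux is attached to the remote core */
	int attach_calls;	/* how many times the attach step actually ran */
};

/* Models rproc_attach(): leaves the power count alone. */
static void model_attach(struct model_rproc *r)
{
	r->attached = 1;
	r->attach_calls++;
}

/* Models rproc_boot(): takes a reference; only the first user attaches. */
static void model_boot(struct model_rproc *r)
{
	if (atomic_fetch_add(&r->power, 1) + 1 > 1)
		return;
	model_attach(r);
}

/* Models rproc_detach(): drops a reference; only the last user detaches. */
static void model_detach(struct model_rproc *r)
{
	assert(atomic_load(&r->power) > 0);
	if (atomic_fetch_sub(&r->power, 1) - 1 > 0)
		return;
	r->attached = 0;
}

/* Models the detach -> attach sequence of rproc_attach_recovery(). */
static int model_attach_recovery(struct model_rproc *r)
{
	model_detach(r);

	/* Re-take the reference that the detach step dropped. */
	if (atomic_fetch_add(&r->power, 1) + 1 > 1)
		return 0;	/* another holder remains: no re-attach happens */

	model_attach(r);
	return 0;
}
```

In this model, with a single holder the count goes 1 -> 0 -> 1 and the attach step runs again. With two holders it goes 2 -> 1 -> 2 and neither the detach nor the attach step does any work, which is the "might not do something" case raised in the review.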

> 
> 
> PS. Does anyone actually use this refcount, or are we just all holding our
> breath for it never going beyond 1?

I think it is the latter.

Thanks,
Peng.

> 
> Regards,
> Bjorn
> 
> > +		return 0;
> > +
> > +	return rproc_attach(rproc);
> > +}
> > +
> > +static int rproc_firmware_recovery(struct rproc *rproc)
> > +{
> > +	const struct firmware *firmware_p;
> > +	struct device *dev = &rproc->dev;
> > +	int ret;
> > +
> > +	ret = rproc_stop(rproc, true);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* generate coredump */
> > +	rproc->ops->coredump(rproc);
> > +
> > +	/* load firmware */
> > +	ret = request_firmware(&firmware_p, rproc->firmware, dev);
> > +	if (ret < 0) {
> > +		dev_err(dev, "request_firmware failed: %d\n", ret);
> > +		return ret;
> > +	}
> > +
> > +	/* boot the remote processor up again */
> > +	ret = rproc_start(rproc, firmware_p);
> > +
> > +	release_firmware(firmware_p);
> > +
> > +	return ret;
> > +}
> > +
> >  /**
> >   * rproc_trigger_recovery() - recover a remoteproc
> >   * @rproc: the remote processor
> > @@ -1901,7 +1945,6 @@ static int __rproc_detach(struct rproc *rproc)
> >   */
> >  int rproc_trigger_recovery(struct rproc *rproc)
> >  {
> > -	const struct firmware *firmware_p;
> >  	struct device *dev = &rproc->dev;
> >  	int ret;
> >
> > @@ -1915,24 +1958,10 @@ int rproc_trigger_recovery(struct rproc *rproc)
> >
> >  	dev_err(dev, "recovering %s\n", rproc->name);
> >
> > -	ret = rproc_stop(rproc, true);
> > -	if (ret)
> > -		goto unlock_mutex;
> > -
> > -	/* generate coredump */
> > -	rproc->ops->coredump(rproc);
> > -
> > -	/* load firmware */
> > -	ret = request_firmware(&firmware_p, rproc->firmware, dev);
> > -	if (ret < 0) {
> > -		dev_err(dev, "request_firmware failed: %d\n", ret);
> > -		goto unlock_mutex;
> > -	}
> > -
> > -	/* boot the remote processor up again */
> > -	ret = rproc_start(rproc, firmware_p);
> > -
> > -	release_firmware(firmware_p);
> > +	if (rproc_has_feature(rproc, RPROC_FEAT_ATTACH_RECOVERY))
> > +		ret = rproc_attach_recovery(rproc);
> > +	else
> > +		ret = rproc_firmware_recovery(rproc);
> >
> >  unlock_mutex:
> >  	mutex_unlock(&rproc->lock);
> > --
> > 2.30.0
> >
