lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aPJiW2ROdaUuCiwi@p14s>
Date: Fri, 17 Oct 2025 09:35:55 -0600
From: Mathieu Poirier <mathieu.poirier@...aro.org>
To: Tanmay Shah <tanmay.shah@....com>
Cc: andersson@...nel.org, linux-remoteproc@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] remoteproc: core: full attach detach during recovery

On Thu, Oct 16, 2025 at 11:12:26AM -0500, Tanmay Shah wrote:
> 
> Hello,
> 
> Please find my comments below:
> 
> On 10/16/25 10:12 AM, Mathieu Poirier wrote:
> > Good morning,
> > 
> > On Thu, Oct 02, 2025 at 08:33:46AM -0700, Tanmay Shah wrote:
> > > Current recovery operation does only virtio device reset, but do not
> > > free and re-allocate all the resources. As third-party is booting the
> > > remote processor during attach-detach, it is better to free and
> > > re-allocate resoruces as resource table state might be unknown to linux
> > > when remote processor boots and reports crash.
> > 
> > 1) When referring to "third-party", should I assume boot loader?
> 
> Here, "third-party" could be a bootloader or another core in a heterogeneous
> system. In my-case it is a platform management controller.

Ok

> 
> 
> > 2) Function rproc_attach_recovery() calls __rproc_detach(), which in turn calls
> > rproc_reset_rsc_table_on_detach().  That function deals explicitly with the
> > resource table.
> 
> As per my understanding, rproc_reset_rsc_table_on_detach() will setup clean
> resource table, that sets vring addresses to 0xffffffff. Please let me know
> if this understanding is not correct.
> 
> If we do not, call rproc_attach(), then correct vring addresses are not
> setup in the resource table for next attach to work. Because,
> rproc_handle_resources() and rproc_alloc_registered_carveouts() are not
> called as part __rproc_attach().

Your assessment is correct.  When the clean_table was introduced, it was to
address the detach->attach scenario.  At that time the only recovery we
supported was to stop and start again, which did not involved the clean_table.
Re-attaching on crash was introduced later in a scenario that may not have
included a resource table.

> 
> > 3) The code in this patch mixes __rproc_detach() with rproc_attach(), something
> > that is likely not a good idea.  We either do __rproc_detach/__rproc_attach or
> > rproc_detach/rproc_attach but I'd like to avoid the mix-and-match to keep the
> > amount of possible states to a minimum.
> > 
> 
> I agree to this. I can find a way to call rproc_detach() and rproc_attach()
> sequentially, instead of __rproc_detach() and rproc_attach() calls. I might
> have to remove rproc_trigger_attach_recovery completely, but that is
> implementation details. We can work it out later, once we agree to the
> current problem & solution.
> 

Humm... You might just be able to call rproc_detach/rproc_attach from
rproc_attach_recovery() if you enhance rproc_detach to be called in a CRASHED
context [1].  Let's see what you find when trying this on real HW.

[1]. https://elixir.bootlin.com/linux/v6.17.1/source/drivers/remoteproc/remoteproc_core.c#L2065


> > If I understand correctly, the main motivation for this patch is the management
> > of the resource table.  But as noted in (2), this should be taken care of.  Am I
> > missing some information?
> > 
> 
> The main motivation is to make the attach operation works during
> attach_recovery(). The __rproc_detach() works as expected, but attach
> doesn't work. After recovery, I am not able to strat RPMsg communication.
> 
> Please let me know if I am missing something.
> 
> Thanks,
> Tanmay
> 
> > Thanks,
> > Mathieu
> > 
> > > 
> > > Signed-off-by: Tanmay Shah <tanmay.shah@....com>
> > > ---
> > > 
> > > Note: RFC patch for design discussion. Please do not merge.
> > > 
> > >   drivers/remoteproc/remoteproc_core.c | 15 ++++++++++++++-
> > >   1 file changed, 14 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > > index 825672100528..4971508bc5b2 100644
> > > --- a/drivers/remoteproc/remoteproc_core.c
> > > +++ b/drivers/remoteproc/remoteproc_core.c
> > > @@ -1786,7 +1786,20 @@ static int rproc_attach_recovery(struct rproc *rproc)
> > >   	if (ret)
> > >   		return ret;
> > > -	return __rproc_attach(rproc);
> > > +	/* clean up all acquired resources */
> > > +	rproc_resource_cleanup(rproc);
> > > +
> > > +	/* release HW resources if needed */
> > > +	rproc_unprepare_device(rproc);
> > > +
> > > +	rproc_disable_iommu(rproc);
> > > +
> > > +	/* Free the copy of the resource table */
> > > +	kfree(rproc->cached_table);
> > > +	rproc->cached_table = NULL;
> > > +	rproc->table_ptr = NULL;
> > > +
> > > +	return rproc_attach(rproc);
> > >   }
> > >   static int rproc_boot_recovery(struct rproc *rproc)
> > > 
> > > base-commit: 56d030ea3330ab737fe6c05f89d52f56208b07ac
> > > -- 
> > > 2.34.1
> > > 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ