lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Z8k-OHsYvARkf3Rt@google.com>
Date: Wed, 5 Mar 2025 22:18:32 -0800
From: Dmitry Torokhov <dmitry.torokhov@...il.com>
To: Jonathan Cameron <jic23@...nel.org>
Cc: Claudiu <claudiu.beznea@...on.dev>,
	prabhakar.mahadev-lad.rj@...renesas.com, lars@...afoo.de,
	linux-iio@...r.kernel.org, linux-renesas-soc@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Claudiu Beznea <claudiu.beznea.uj@...renesas.com>,
	gregkh@...uxfoundation.org
Subject: Re: [PATCH v3 1/2] iio: adc: rzg2l_adc: Open a devres group

On Wed, Mar 05, 2025 at 02:21:22PM +0000, Jonathan Cameron wrote:
> On Mon, 24 Feb 2025 14:06:06 +0200
> Claudiu <claudiu.beznea@...on.dev> wrote:
> 
> > From: Claudiu Beznea <claudiu.beznea.uj@...renesas.com>
> > 
> > On all systems where the rzg2l_adc driver is used, the ADC clocks are part
> > of a PM domain. The code that implements the PM domains support is in
> > drivers/clk/renesas/rzg2l-cpg.c, the functions of interest for this commit
> > being rzg2l_cpg_attach_dev() and rzg2l_cpg_deattach_dev(). The PM
> > domains support is registered with GENPD_FLAG_PM_CLK which, according to
> > the documentation, instructs genpd to use the PM clk framework while
> > powering on/off attached devices.
> > 
> > During probe, the ADC device is attached to the PM domain
> > controlling the ADC clocks. Similarly, during removal, the ADC device is
> > detached from the PM domain.
> > 
> > The detachment call stack is as follows:
> > 
> > device_driver_detach() ->
> >   device_release_driver_internal() ->
> >     __device_release_driver() ->
> >       device_remove() ->
> >         platform_remove() ->
> >           dev_pm_domain_detach()
> > 
> > During driver unbind, after the ADC device is detached from its PM domain,
> > the device_unbind_cleanup() function is called, which subsequently invokes
> > devres_release_all(). This function handles devres resource cleanup.
> > 
> > If runtime PM is enabled via devm_pm_runtime_enable(), the cleanup process
> > triggers the action or reset function for disabling runtime PM. This
> > function is pm_runtime_disable_action(), which leads to the following call
> > stack of interest when called:
> > 
> > pm_runtime_disable_action() ->
> >   pm_runtime_dont_use_autosuspend() ->
> >     __pm_runtime_use_autosuspend() ->
> >       update_autosuspend() ->
> >         rpm_idle()
> > 
> > The rpm_idle() function attempts to runtime resume the ADC device. However,
> > at the point it is called, the ADC device is no longer part of the PM
> > domain (which manages the ADC clocks). Since the rzg2l_adc runtime PM
> > APIs directly modifies hardware registers, the
> > rzg2l_adc_pm_runtime_resume() function is invoked without the ADC clocks
> > being enabled. This is because the PM domain no longer resumes along with
> > the ADC device. As a result, this leads to system aborts.
> > 
> > Open a devres group in the driver probe and release it in the driver
> > remove. This ensures the runtime PM is disabled (though the devres group)
> > after the rzg2l_adc_remove() finishes its execution avoiding the described
> > scenario.
> > 
> > Fixes: 89ee8174e8c8 ("iio: adc: rzg2l_adc: Simplify the runtime PM code")
> > Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@...renesas.com>
> 
> +CC Greg KH given the in driver suggestion was his and I think
> this discussion is not necessarily over!  Also Dmitry for his info.

So let's ignore this specific driver. We know that at the moment
platform_remove() calls dev_pm_domain_detach() without waiting for
drivers to release their devm-managed resources. This means that devices
are potentially powered down and unresponsive when devm release handlers
are run. This behavior affects _all_ platform drivers using devm APIs.

We should not and can not go and require a fix to each individual
driver. This has to be solved at bus level. We have much fewer buses
than devices.

Driver core might supply helpers, but not all buses need this and like I
mentioned elsewhere buses themselves may want to switch to devm APIs for
their resources.

Thanks.

-- 
Dmitry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ