lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPDyKFqrAS4iV59S-zJ9H7_3VuGr9JdZABhfUGBwTzQNDCasaw@mail.gmail.com>
Date: Thu, 22 May 2025 13:53:57 +0200
From: Ulf Hansson <ulf.hansson@...aro.org>
To: Claudiu Beznea <claudiu.beznea@...on.dev>
Cc: Dmitry Torokhov <dmitry.torokhov@...il.com>, Jonathan Cameron <jic23@...nel.org>, 
	"Rafael J. Wysocki" <rafael@...nel.org>, Daniel Lezcano <daniel.lezcano@...aro.org>, dakr@...nel.org, 
	linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org, 
	linux-renesas-soc@...r.kernel.org, geert@...ux-m68k.org, 
	Claudiu Beznea <claudiu.beznea.uj@...renesas.com>, 
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>, linux-iio@...r.kernel.org, bhelgaas@...gle.com
Subject: Re: [PATCH] driver core: platform: Use devres group to free driver
 probe resources

On Thu, 22 May 2025 at 11:48, Claudiu Beznea <claudiu.beznea@...on.dev> wrote:
>
> Hi, Ulf,
>
> On 21.05.2025 17:57, Dmitry Torokhov wrote:
> > On Wed, May 21, 2025 at 02:37:08PM +0200, Ulf Hansson wrote:
> >> On Wed, 21 May 2025 at 07:41, Claudiu Beznea <claudiu.beznea@...on.dev> wrote:
> >>>
> >>> Hi, Ulf,
> >>>
> >>> On 20.05.2025 15:09, Ulf Hansson wrote:
> >>>> For example, even if the order is made correctly, suppose a driver's
> >>>> ->remove() callback completes by turning off the resources for its
> >>>> device and leaves runtime PM enabled, as it relies on devres to do it
> >>>> some point later. Beyond this point, nothing would prevent userspace
> >>>> for runtime resuming/suspending the device via sysfs.
> >>>
> >>> If I'm not wrong, that can't happen? The driver_sysfs_remove() is called
> >>> before device_remove() (which calls the driver remove) is called, this
> >>> being the call path:
> >>>
> >>> device_driver_detach() ->
> >>>   device_release_driver_internal() ->
> >>>     __device_release_driver() ->
> >>>       driver_sysfs_remove()
> >>>       // ...
> >>>       device_remove()
> >>>
> >>> And the driver_sysfs_remove() calls in the end __kernfs_remove() which
> >>> looks to me like the place that actually drops the entries from sysfs, this
> >>> being a call path for it:
> >>>
> >>> driver_sysfs_remove() ->
> >>>   sysfs_remove_link() ->
> >>>     kernfs_remove_by_name() ->
> >>>       kernfs_remove_by_name_ns() ->
> >>>         __kernfs_remove() ->
> >>>
> >>> activating the following line in __kernfs_remove():
> >>>
> >>> pr_debug("kernfs %s: removing\n", kernfs_rcu_name(kn));
> >>>
> >>> leads to the following prints when unbinding the watchdog device from its
> >>> watchdog driver (attached to platform bus) on my board:
> >>> https://p.fr33tux.org/935252
> >>
> >> Indeed this is a very good point you make! I completely overlooked
> >> this fact, thanks a lot for clarifying this!
> >>
> >> However, my main point still stands.
> >>
> >> In the end, there is nothing preventing rpm_suspend|resume|idle() in
> >> drivers/base/power/runtime.c from running (don't forget runtime PM is
> >> asynchronous too) for the device in question. This could lead to that
> >> a ->runtime_suspend|resume|idle() callback becomes executed at any
> >> point in time, as long as we haven't called pm_runtime_disable() for
> >> the device.
> >
> > So exactly the same may happen if you enter driver->remove() and
> > something calls runtime API before pm_runtime_disable() is called.
> > The driver has (as they should be doing currently) be prepared for this.
>
> I took the time and tried to do a comparison of the current solutions
> (describing the bad and good things I see), trying to understand your
> concerns with regards to RPM suspend|resume|idle while unbinding a device
> from its driver.
>
> I see the following cases:
>
> Case 1/ the current approach when devm_pm_runtime_enable() is used in
> driver's ->probe() with the current code base:
>
> - right after driver ->remove() finish its execution clocks are detached
>   from the PM domain, through dev_pm_domain_detach() call in
>   platform_remove()
>
> - any subsequent RPM resume|suspend|idle will lead to failure if the driver
>   specific RPM APIs access directly registers and counts on PM domain to
>   enable/disable the clocks
>
> - at this point, if the IRQs are shared (but not only) and devm requested
>   the driver's IRQ handler can still be called asynchronously; driver
>   should be prepared for such events and should be written to work for such
>   scenarios; but as the clocks are not in the PM domain anymore and RPM is
>   still enabled at this point, if the driver don't run runtime suspend on
>   probe (and runtime resume/suspend on runtime), I think (because I haven't
>   investigated this yet) it can't rely on pm_runtime_active()/
>   pm_runtime_suspended() checks in interrupt handlers
>   and can't decide if it can interrogate registers or not; interrogating
>   should lead to failure at this stage as the clocks are disabled; drivers
>   should work in such scenario and the CONFIG_DEBUG_SHIRQ is a way to check
>   they can; I previously debugged a similar issue on drivers/net/ethernet/
>   renesas/ravb driver where using devm_pm_runtime_enable() in probe and
>   pm_runtime_suspended() checks in IRQ handlers was the way to make this
>   scenario happy; at that time I wasn't able to find that
>   dev_pm_domain_detach() have the impact discussed in this thread
>
> Case 2/ What is proposed in this patch: devm_pm_runtime_enable() used +
> open devres group after dev_pm_domain_attach() (in probe) and close the
> devres group before dev_pm_domain_attach() (in remove):
>
> - right after the driver ->remove() is executed only the driver allocated
>   devres resources are freed; this happens before dev_pm_domain_deattach()
>   is called, though the proposed devres_release_group() call in this patch
>
> - while doing this, driver can still get async RPM suspend|resume|idle
>   requests; is like the execution is in the driver ->remove()
>   but the pm_runtime_disable() hasn't been called yet
>
> - as the runtime PM is enabled in driver's ->probe() mostly after the HW is
>   prepared to take requests and all the other devm resources are allocated,
>   the RPM disable is going to be among the first things to be called by the
>   devres_release_group()
>
> - then, after RPM disable, all the devres resources allocated only in the
>   driver's ->probe() are cleaned up in reverse order, just like
>   device_unbind_cleanup() -> devres_release_all() call in
>   __device_release_driver() is doing, but limited only to the resources
>   allocated by the driver itself; I personally see this like manually
>   allocating and freeing resources in the driver itself w/o relying on
>   devres
>
> - then it comes the turn of dev_pm_domain_detach() call in
>   platform_remove(): at the time dev_pm_domain_detach() is executed the
>   runtime PM is disabled and all the devres resources allocated by driver
>   are freed as well
>
> - after the dev_pm_domain_detach() is executed all the driver resources
>   are cleaned up, the driver can't get IRQs as it's handler was already
>   unregistered, no other user can execute rpm suspend|resume|idle
>   as the RPM is disabled at this time
>
> Case 3/ devm_pm_runtime_enabled() dropped and replaced by manual cleanup:
> - the driver code is going be complicated, difficult to maintain and error
>   prone

Yes, the driver's code would become slightly more complicated, but
more importantly it would be correct.

To me it sounds like the driver's ->remove() callback could do this:

pm_runtime_get_sync()
pm_runtime_disable()
pm_runtime_put_noidle()

In this way, the driver will runtime resume its device, allowing
devres to drop/turn-off resources in the order we want. Except for the
clocks, as those would be turned off via dev_pm_domain_detach() before
the IRQ handler is freed (via devres), right?

To avoid getting the IRQ handler to be called when it can't access
registers, we could do one of the below:
*) Look for a condition in the IRQ handler and bail-out when we know
we should not manage IRQs. Is using pm_runtime_enabled() sufficient,
you think? Otherwise we need a driver specific flag, which should be
set in ->remove().
*) Don't use devm* when registering the IRQ handler.

Yes, both options further contribute to making the driver code
slightly more complicated, but if you want to solve the problem sooner
than later, I think this is what you need to do. Yet, I think there is
another option too, see below.

>
> I may have missed considering things when describing the case 2 (which is
> what is proposed by this patch) as I don't have the full picture behind the
> dev_pm_domain_detach() call in platform bus remove. If so, please correct me.

The dev_pm_domain_attach|detach() calls in bus level code
(probe/remove) were added there a long time ago, way before devres was
being used like today.

Currently we also have devm_pm_domain_attach_list(), which is used
when devices have multiple PM domains to attach too. This is *not*
called by bus-level code, but by the driver themselves. For these
cases, we would not encounter the problems you have been facing with
clocks/IRQ-handler, I think - because the devres order is maintained
for PM domains too.

That said, I think adding a devm_pm_domain_attach() interface would
make perfect sense. Then we can try to replace
dev_pm_domain_attach|detach() in bus level code, with just a call to
devm_pm_domain_attach(). In this way, we should preserve the
expectation for drivers around devres for PM domains. Even if it would
change the behaviour for some drivers, it still sounds like the
correct thing to do in my opinion.

[...]

Kind regards
Uffe

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ