lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPDyKFpLF2P438GGWSgbXzpT7JNdUjtZ2ZxYf1_4=fNUX3s-KQ@mail.gmail.com>
Date: Thu, 22 May 2025 18:28:44 +0200
From: Ulf Hansson <ulf.hansson@...aro.org>
To: Claudiu Beznea <claudiu.beznea@...on.dev>
Cc: Dmitry Torokhov <dmitry.torokhov@...il.com>, Jonathan Cameron <jic23@...nel.org>, 
	"Rafael J. Wysocki" <rafael@...nel.org>, Daniel Lezcano <daniel.lezcano@...aro.org>, dakr@...nel.org, 
	linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org, 
	linux-renesas-soc@...r.kernel.org, geert@...ux-m68k.org, 
	Claudiu Beznea <claudiu.beznea.uj@...renesas.com>, 
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>, linux-iio@...r.kernel.org, bhelgaas@...gle.com
Subject: Re: [PATCH] driver core: platform: Use devres group to free driver
 probe resources

On Thu, 22 May 2025 at 16:08, Claudiu Beznea <claudiu.beznea@...on.dev> wrote:
>
> Hi, Ulf,
>
> On 22.05.2025 14:53, Ulf Hansson wrote:
> > On Thu, 22 May 2025 at 11:48, Claudiu Beznea <claudiu.beznea@...on.dev> wrote:
> >>
> >> Hi, Ulf,
> >>
> >> On 21.05.2025 17:57, Dmitry Torokhov wrote:
> >>> On Wed, May 21, 2025 at 02:37:08PM +0200, Ulf Hansson wrote:
> >>>> On Wed, 21 May 2025 at 07:41, Claudiu Beznea <claudiu.beznea@...on.dev> wrote:
> >>>>>
> >>>>> Hi, Ulf,
> >>>>>
> >>>>> On 20.05.2025 15:09, Ulf Hansson wrote:
> >>>>>> For example, even if the order is made correctly, suppose a driver's
> >>>>>> ->remove() callback completes by turning off the resources for its
> >>>>>> device and leaves runtime PM enabled, as it relies on devres to do it
> >>>>>> some point later. Beyond this point, nothing would prevent userspace
> >>>>>> for runtime resuming/suspending the device via sysfs.
> >>>>>
> >>>>> If I'm not wrong, that can't happen? The driver_sysfs_remove() is called
> >>>>> before device_remove() (which calls the driver remove) is called, this
> >>>>> being the call path:
> >>>>>
> >>>>> device_driver_detach() ->
> >>>>>   device_release_driver_internal() ->
> >>>>>     __device_release_driver() ->
> >>>>>       driver_sysfs_remove()
> >>>>>       // ...
> >>>>>       device_remove()
> >>>>>
> >>>>> And the driver_sysfs_remove() calls in the end __kernfs_remove() which
> >>>>> looks to me like the place that actually drops the entries from sysfs, this
> >>>>> being a call path for it:
> >>>>>
> >>>>> driver_sysfs_remove() ->
> >>>>>   sysfs_remove_link() ->
> >>>>>     kernfs_remove_by_name() ->
> >>>>>       kernfs_remove_by_name_ns() ->
> >>>>>         __kernfs_remove() ->
> >>>>>
> >>>>> activating the following line in __kernfs_remove():
> >>>>>
> >>>>> pr_debug("kernfs %s: removing\n", kernfs_rcu_name(kn));
> >>>>>
> >>>>> leads to the following prints when unbinding the watchdog device from its
> >>>>> watchdog driver (attached to platform bus) on my board:
> >>>>> https://p.fr33tux.org/935252
> >>>>
> >>>> Indeed this is a very good point you make! I completely overlooked
> >>>> this fact, thanks a lot for clarifying this!
> >>>>
> >>>> However, my main point still stands.
> >>>>
> >>>> In the end, there is nothing preventing rpm_suspend|resume|idle() in
> >>>> drivers/base/power/runtime.c from running (don't forget runtime PM is
> >>>> asynchronous too) for the device in question. This could lead to that
> >>>> a ->runtime_suspend|resume|idle() callback becomes executed at any
> >>>> point in time, as long as we haven't called pm_runtime_disable() for
> >>>> the device.
> >>>
> >>> So exactly the same may happen if you enter driver->remove() and
> >>> something calls runtime API before pm_runtime_disable() is called.
> >>> The driver has (as they should be doing currently) be prepared for this.
> >>
> >> I took the time and tried to do a comparison of the current solutions
> >> (describing the bad and good things I see), trying to understand your
> >> concerns with regards to RPM suspend|resume|idle while unbinding a device
> >> from its driver.
> >>
> >> I see the following cases:
> >>
> >> Case 1/ the current approach when devm_pm_runtime_enable() is used in
> >> driver's ->probe() with the current code base:
> >>
> >> - right after driver ->remove() finish its execution clocks are detached
> >>   from the PM domain, through dev_pm_domain_detach() call in
> >>   platform_remove()
> >>
> >> - any subsequent RPM resume|suspend|idle will lead to failure if the driver
> >>   specific RPM APIs access directly registers and counts on PM domain to
> >>   enable/disable the clocks
> >>
> >> - at this point, if the IRQs are shared (but not only) and devm requested
> >>   the driver's IRQ handler can still be called asynchronously; driver
> >>   should be prepared for such events and should be written to work for such
> >>   scenarios; but as the clocks are not in the PM domain anymore and RPM is
> >>   still enabled at this point, if the driver don't run runtime suspend on
> >>   probe (and runtime resume/suspend on runtime), I think (because I haven't
> >>   investigated this yet) it can't rely on pm_runtime_active()/
> >>   pm_runtime_suspended() checks in interrupt handlers
> >>   and can't decide if it can interrogate registers or not; interrogating
> >>   should lead to failure at this stage as the clocks are disabled; drivers
> >>   should work in such scenario and the CONFIG_DEBUG_SHIRQ is a way to check
> >>   they can; I previously debugged a similar issue on drivers/net/ethernet/
> >>   renesas/ravb driver where using devm_pm_runtime_enable() in probe and
> >>   pm_runtime_suspended() checks in IRQ handlers was the way to make this
> >>   scenario happy; at that time I wasn't able to find that
> >>   dev_pm_domain_detach() have the impact discussed in this thread
> >>
> >> Case 2/ What is proposed in this patch: devm_pm_runtime_enable() used +
> >> open devres group after dev_pm_domain_attach() (in probe) and close the
> >> devres group before dev_pm_domain_attach() (in remove):
> >>
> >> - right after the driver ->remove() is executed only the driver allocated
> >>   devres resources are freed; this happens before dev_pm_domain_deattach()
> >>   is called, though the proposed devres_release_group() call in this patch
> >>
> >> - while doing this, driver can still get async RPM suspend|resume|idle
> >>   requests; is like the execution is in the driver ->remove()
> >>   but the pm_runtime_disable() hasn't been called yet
> >>
> >> - as the runtime PM is enabled in driver's ->probe() mostly after the HW is
> >>   prepared to take requests and all the other devm resources are allocated,
> >>   the RPM disable is going to be among the first things to be called by the
> >>   devres_release_group()
> >>
> >> - then, after RPM disable, all the devres resources allocated only in the
> >>   driver's ->probe() are cleaned up in reverse order, just like
> >>   device_unbind_cleanup() -> devres_release_all() call in
> >>   __device_release_driver() is doing, but limited only to the resources
> >>   allocated by the driver itself; I personally see this like manually
> >>   allocating and freeing resources in the driver itself w/o relying on
> >>   devres
> >>
> >> - then it comes the turn of dev_pm_domain_detach() call in
> >>   platform_remove(): at the time dev_pm_domain_detach() is executed the
> >>   runtime PM is disabled and all the devres resources allocated by driver
> >>   are freed as well
> >>
> >> - after the dev_pm_domain_detach() is executed all the driver resources
> >>   are cleaned up, the driver can't get IRQs as it's handler was already
> >>   unregistered, no other user can execute rpm suspend|resume|idle
> >>   as the RPM is disabled at this time
> >>
> >> Case 3/ devm_pm_runtime_enabled() dropped and replaced by manual cleanup:
> >> - the driver code is going be complicated, difficult to maintain and error
> >>   prone
> >
> > Yes, the driver's code would become slightly more complicated, but
> > more importantly it would be correct.
> >
> > To me it sounds like the driver's ->remove() callback could do this:
> >
> > pm_runtime_get_sync()
> > pm_runtime_disable()
> > pm_runtime_put_noidle()
>
> In my case it was just pm_runtime_disable() at the end of driver ->remove()
> and pm_runtime_active() checks in IRQ handlers which didn't worked after
> driver ->remove() finished execution due to disable_depth being incremented
> in pm_runtime_disable(). The IRQs were devm requested.
>
> The solution found at the that time was to use devm_pm_runtime_enable() in
> probe and pm_runtime_suspended() calls in IRQ handlers.
>
> >
> > In this way, the driver will runtime resume its device, allowing
> > devres to drop/turn-off resources in the order we want.
> > Except for the
> > clocks, as those would be turned off via dev_pm_domain_detach() before
> > the IRQ handler is freed (via devres), right?
> >
> > To avoid getting the IRQ handler to be called when it can't access
> > registers, we could do one of the below:
> > *) Look for a condition in the IRQ handler and bail-out when we know
> > we should not manage IRQs. Is using pm_runtime_enabled() sufficient,
> > you think? Otherwise we need a driver specific flag, which should be
> > set in ->remove().
> > *) Don't use devm* when registering the IRQ handler.
>
> That's true.
>
> >
> > Yes, both options further contribute to making the driver code
> > slightly more complicated, but if you want to solve the problem sooner
> > than later, I think this is what you need to do. Yet, I think there is
> > another option too, see below.
> >
> >>
> >> I may have missed considering things when describing the case 2 (which is
> >> what is proposed by this patch) as I don't have the full picture behind the
> >> dev_pm_domain_detach() call in platform bus remove. If so, please correct me.
> >
> > The dev_pm_domain_attach|detach() calls in bus level code
> > (probe/remove) were added there a long time ago, way before devres was
> > being used like today.
> >
> > Currently we also have devm_pm_domain_attach_list(), which is used
> > when devices have multiple PM domains to attach too. This is *not*
> > called by bus-level code, but by the driver themselves. For these
> > cases, we would not encounter the problems you have been facing with
> > clocks/IRQ-handler, I think - because the devres order is maintained
> > for PM domains too.
> >
> > That said, I think adding a devm_pm_domain_attach() interface would
> > make perfect sense. Then we can try to replace
> > dev_pm_domain_attach|detach() in bus level code, with just a call to
> > devm_pm_domain_attach(). In this way, we should preserve the
> > expectation for drivers around devres for PM domains. Even if it would
> > change the behaviour for some drivers, it still sounds like the
> > correct thing to do in my opinion.
>
> This looks good to me, as well. I did prototype it on my side and tested on
> all my failure cases and it works.

That's great! I am happy to help review, if/when you decide to post it.

Kind regards
Uffe

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ