Message-ID: <CAJZ5v0hi2N9iFy-Dh0mkN3zDBytMFXqxosujbPGO5JxnWhBxmg@mail.gmail.com>
Date: Sat, 21 Jun 2025 13:15:45 +0200
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Ulf Hansson <ulf.hansson@...aro.org>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Claudiu Beznea <claudiu.beznea@...on.dev>, 
	Jonathan Cameron <jic23@...nel.org>, Dmitry Torokhov <dmitry.torokhov@...il.com>, gregkh@...uxfoundation.org, 
	dakr@...nel.org, len.brown@...el.com, pavel@...nel.org, 
	daniel.lezcano@...aro.org, linux-kernel@...r.kernel.org, 
	linux-pm@...r.kernel.org, bhelgaas@...gle.com, geert@...ux-m68k.org, 
	linux-iio@...r.kernel.org, linux-renesas-soc@...r.kernel.org, 
	fabrizio.castro.jz@...esas.com, 
	Claudiu Beznea <claudiu.beznea.uj@...renesas.com>
Subject: Re: [PATCH v3 1/2] PM: domains: Add devres variant for dev_pm_domain_attach()

On Thu, Jun 19, 2025 at 2:21 PM Ulf Hansson <ulf.hansson@...aro.org> wrote:
>
> On Mon, 16 Jun 2025 at 13:47, Rafael J. Wysocki <rafael@...nel.org> wrote:
> >
> > On Mon, Jun 16, 2025 at 1:37 PM Claudiu Beznea <claudiu.beznea@...on.dev> wrote:
> > >
> > >
> > >
> > > On 16.06.2025 14:18, Rafael J. Wysocki wrote:
> > > > On Mon, Jun 16, 2025 at 11:37 AM Claudiu Beznea
> > > > <claudiu.beznea@...on.dev> wrote:
> > > >>
> > > >> Hi, Rafael,
> > > >>
> > > >> On 13.06.2025 13:02, Rafael J. Wysocki wrote:
> > > >>> On Fri, Jun 13, 2025 at 9:39 AM Claudiu Beznea <claudiu.beznea@...on.dev> wrote:
> > > >>>>
> > > >>>> Hi, Rafael,
> > > >>>>
> > > >>>> On 09.06.2025 22:59, Rafael J. Wysocki wrote:
> > > >>>>> On Sat, Jun 7, 2025 at 3:06 PM Jonathan Cameron <jic23@...nel.org> wrote:
> > > >>>>>>
> > > >>>>>> On Fri, 6 Jun 2025 22:01:52 +0200
> > > >>>>>> "Rafael J. Wysocki" <rafael@...nel.org> wrote:
> > > >>>>>>
> > > >>>>>> Hi Rafael,
> > > >>>>>>
> > > >>>>>>> On Fri, Jun 6, 2025 at 8:55 PM Dmitry Torokhov
> > > >>>>>>> <dmitry.torokhov@...il.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>> On Fri, Jun 06, 2025 at 06:00:34PM +0200, Rafael J. Wysocki wrote:
> > > >>>>>>>>> On Fri, Jun 6, 2025 at 1:18 PM Claudiu <claudiu.beznea@...on.dev> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> From: Claudiu Beznea <claudiu.beznea.uj@...renesas.com>
> > > >>>>>>>>>>
> > > >>>>>>>>>> The dev_pm_domain_attach() function is typically used in bus code alongside
> > > >>>>>>>>>> dev_pm_domain_detach(), often following patterns like:
> > > >>>>>>>>>>
> > > >>>>>>>>>> static int bus_probe(struct device *_dev)
> > > >>>>>>>>>> {
> > > > >>>>>>>>>>     struct bus_driver *drv = to_bus_driver(_dev->driver);
> > > >>>>>>>>>>     struct bus_device *dev = to_bus_device(_dev);
> > > >>>>>>>>>>     int ret;
> > > >>>>>>>>>>
> > > >>>>>>>>>>     // ...
> > > >>>>>>>>>>
> > > >>>>>>>>>>     ret = dev_pm_domain_attach(_dev, true);
> > > >>>>>>>>>>     if (ret)
> > > >>>>>>>>>>         return ret;
> > > >>>>>>>>>>
> > > >>>>>>>>>>     if (drv->probe)
> > > >>>>>>>>>>         ret = drv->probe(dev);
> > > >>>>>>>>>>
> > > >>>>>>>>>>     // ...
> > > >>>>>>>>>> }
> > > >>>>>>>>>>
> > > >>>>>>>>>> static void bus_remove(struct device *_dev)
> > > >>>>>>>>>> {
> > > > >>>>>>>>>>     struct bus_driver *drv = to_bus_driver(_dev->driver);
> > > >>>>>>>>>>     struct bus_device *dev = to_bus_device(_dev);
> > > >>>>>>>>>>
> > > >>>>>>>>>>     if (drv->remove)
> > > >>>>>>>>>>         drv->remove(dev);
> > > >>>>>>>>>>     dev_pm_domain_detach(_dev);
> > > >>>>>>>>>> }
> > > >>>>>>>>>>
> > > >>>>>>>>>> When the driver's probe function uses devres-managed resources that depend
> > > >>>>>>>>>> on the power domain state, those resources are released later during
> > > >>>>>>>>>> device_unbind_cleanup().
> > > >>>>>>>>>>
> > > >>>>>>>>>> Releasing devres-managed resources that depend on the power domain state
> > > >>>>>>>>>> after detaching the device from its PM domain can cause failures.
> > > >>>>>>>>>>
> > > >>>>>>>>>> For example, if the driver uses devm_pm_runtime_enable() in its probe
> > > >>>>>>>>>> function, and the device's clocks are managed by the PM domain, then
> > > >>>>>>>>>> during removal the runtime PM is disabled in device_unbind_cleanup() after
> > > >>>>>>>>>> the clocks have been removed from the PM domain. It may happen that the
> > > >>>>>>>>>> devm_pm_runtime_enable() action causes the device to be runtime-resumed.
> > > >>>>>>>>>
> > > >>>>>>>>> Don't use devm_pm_runtime_enable() then.
> > > >>>>>>>>
> > > >>>>>>>> What about other devm_ APIs? Are you suggesting that platform drivers
> > > >>>>>>>> should not be using devm_clk*(), devm_regulator_*(),
> > > >>>>>>>> devm_request_*_irq() and devm_add_action_or_reset()? Because again,
> > > >>>>>>>> dev_pm_domain_detach() that is called by platform bus_remove() may shut
> > > >>>>>>>> off the device too early, before cleanup code has a chance to execute
> > > >>>>>>>> proper cleanup.
> > > >>>>>>>>
> > > >>>>>>>> The issue is not limited to runtime PM.
> > > >>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>> If the driver specific runtime PM APIs access registers directly, this
> > > >>>>>>>>>> will lead to accessing device registers without clocks being enabled.
> > > >>>>>>>>>> Similar issues may occur with other devres actions that access device
> > > >>>>>>>>>> registers.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Add devm_pm_domain_attach(). When replacing the dev_pm_domain_attach() and
> > > >>>>>>>>>> dev_pm_domain_detach() in bus probe and bus remove, it ensures that the
> > > >>>>>>>>>> device is detached from its PM domain in device_unbind_cleanup(), only
> > > > >>>>>>>>>> after all driver's devres-managed resources have been released.
> > > >>>>>>>>>>
> > > >>>>>>>>>> For flexibility, the implemented devm_pm_domain_attach() has 2 state
> > > >>>>>>>>>> arguments, one for the domain state on attach, one for the domain state on
> > > >>>>>>>>>> detach.
> > > >>>>>>>>>
> > > > >>>>>>>>> dev_pm_domain_attach() is not part of the driver API and I'm not convinced at
> > > >>>>>>>>
> > > >>>>>>>> Is the concern that devm_pm_domain_attach() will be [ab]used by drivers?
> > > >>>>>>>
> > > >>>>>>> Yes, among other things.
> > > >>>>>>
> > > >>>>>> Maybe naming could make abuse at least obvious to spot? e.g.
> > > >>>>>> pm_domain_attach_with_devm_release()
> > > >>>>>
> > > >>>>> If I'm not mistaken, it is not even necessary to use devres for this.
> > > >>>>>
> > > >>>>> You might as well add a dev_pm_domain_detach() call to
> > > >>>>> device_unbind_cleanup() after devres_release_all().  There is a slight
> > > >>>>> complication related to the second argument of it, but I suppose that
> > > >>>>> this can be determined at the attach time and stored in a new device
> > > >>>>> PM flag, or similar.
> > > >>>>>
> > > >>>>
> > > > >>>> I looked into this solution. I've tested it for all my failure cases and
> > > > >>>> it went well.
> > > >>>
> > > >>> OK
> > > >>>
> > > >>>>> Note that dev->pm_domain is expected to be cleared by ->detach(), so
> > > >>>>> this should not cause the domain to be detached twice in a row from
> > > >>>>> the same device, but that needs to be double-checked.
> > > >>>>
> > > >>>> The genpd_dev_pm_detach() calls genpd_remove_device() ->
> > > >>>> dev_pm_domain_set(dev, NULL) which sets the dev->pm_domain = NULL. I can't
> > > >>>> find any other detach function in the current code base.
> > > >>>
> > > >>> There is also acpi_dev_pm_detach() which can be somewhat hard to find,
> > > > >>> but it calls dev_pm_domain_set(dev, NULL) as well.
> > > >>>
> > > >>>> The code I've tested for this solution is this one:
> > > >>>>
> > > >>>> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > > >>>> index b526e0e0f52d..5e9750d007b4 100644
> > > >>>> --- a/drivers/base/dd.c
> > > >>>> +++ b/drivers/base/dd.c
> > > >>>> @@ -25,6 +25,7 @@
> > > >>>>  #include <linux/kthread.h>
> > > >>>>  #include <linux/wait.h>
> > > >>>>  #include <linux/async.h>
> > > >>>> +#include <linux/pm_domain.h>
> > > >>>>  #include <linux/pm_runtime.h>
> > > >>>>  #include <linux/pinctrl/devinfo.h>
> > > >>>>  #include <linux/slab.h>
> > > >>>> @@ -552,8 +553,11 @@ static void device_unbind_cleanup(struct device *dev)
> > > >>>>         dev->dma_range_map = NULL;
> > > >>>>         device_set_driver(dev, NULL);
> > > >>>>         dev_set_drvdata(dev, NULL);
> > > >>>> -       if (dev->pm_domain && dev->pm_domain->dismiss)
> > > >>>> -               dev->pm_domain->dismiss(dev);
> > > >>>> +       if (dev->pm_domain) {
> > > >>>> +               if (dev->pm_domain->dismiss)
> > > >>>> +                       dev->pm_domain->dismiss(dev);
> > > >>>> +               dev_pm_domain_detach(dev, dev->pm_domain->detach_power_off);
> > > >>>
> > > >>> I would do the "detach" before the "dismiss" to retain the current ordering.
> > > >>
> > > >> I applied on my local development branch all your suggestions except this
> > > >> one because genpd_dev_pm_detach() as well as acpi_dev_pm_detach() set
> > > >> dev->pm_domain = NULL.
> > > >>
> > > >> Due to this I would call first ->dismiss() then ->detach(), as initially
> > > >> proposed. Please let me know if you consider it otherwise.
> > > >
> > > > This is a matter of adding one more dev->pm_domain check AFAICS, but OK.
> > >
> > > > I don't know all the subtleties around this. My concern with adding one
> > > > more check on dev->pm_domain was that dev->pm_domain->dismiss() would
> > > > never be called if ->detach() were called before ->dismiss(), because
> > > > ->detach() sets dev->pm_domain = NULL (as it does today through
> > > > genpd_dev_pm_detach() and acpi_dev_pm_detach()).
> > >
> > > For platform drivers that used to call dev_pm_domain_detach() in platform
> > > bus remove function, if I'm not wrong, the dev->pm_domain->dismiss() was
> > > never called previously. If that is a valid scenario, the code proposed in
> > > this thread will change the behavior for devices that have ->dismiss()
> > > implemented.
> >
> > ->dismiss() and ->detach() are supposed to be mutually exclusive, so
> > this should not be a problem either way and in practice so far the
> > only user of ->dismiss() has been acpi_lpss_pm_domain which doesn't do
> > ->detach() at all.
>
> May I ask whether you know of any really good reasons to keep
> the ->dismiss() callback around?
>
> Can't acpi_lpss_pm_domain() convert to use the ->detach() callback
> instead? Just thinking that it would be easier, but maybe it doesn't
> work.

It will, but let's just not make too many changes in one go.
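
For reference, the ordering problem discussed earlier in this thread can be
modeled in user space. What follows is only a rough sketch under stated
assumptions, not kernel code: the structures are simplified stand-ins for
struct device and struct dev_pm_domain, every model_*() function is a
hypothetical name made up for illustration, and "clocks_on" stands for
whatever the PM domain keeps powered on behalf of the device. The
detach_power_off field mirrors the one used in the diff quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct device;

/* Simplified stand-in for struct dev_pm_domain (assumption, not the kernel type). */
struct dev_pm_domain {
	void (*detach)(struct device *dev, bool power_off);
	bool detach_power_off;	/* stored at attach time, as in the quoted diff */
};

/* Simplified stand-in for struct device. */
struct device {
	struct dev_pm_domain *pm_domain;
	bool clocks_on;		/* models clocks managed by the PM domain */
	int bad_accesses;	/* register accesses made with clocks off */
};

/* Models a ->detach() callback: may power off, and clears dev->pm_domain. */
static void model_detach(struct device *dev, bool power_off)
{
	if (power_off)
		dev->clocks_on = false;
	dev->pm_domain = NULL;
}

/* Models dev_pm_domain_detach(): a no-op once the domain is gone. */
static void model_dev_pm_domain_detach(struct device *dev, bool power_off)
{
	if (dev->pm_domain && dev->pm_domain->detach)
		dev->pm_domain->detach(dev, power_off);
}

/* Models a devres action that touches device registers on release. */
static void model_devres_release_all(struct device *dev)
{
	if (!dev->clocks_on)
		dev->bad_accesses++;
}

/*
 * Runs one unbind sequence and returns the number of bad register accesses.
 * detach_in_bus_remove == true models the existing bus remove pattern;
 * false models detaching only after the devres release.
 */
int model_unbind(bool detach_in_bus_remove)
{
	struct dev_pm_domain pd = {
		.detach = model_detach,
		.detach_power_off = true,
	};
	struct device dev = {
		.pm_domain = &pd,
		.clocks_on = true,
		.bad_accesses = 0,
	};

	if (detach_in_bus_remove)
		model_dev_pm_domain_detach(&dev, true);

	model_devres_release_all(&dev);

	/* Detach after devres release; skipped if already detached. */
	if (dev.pm_domain)
		model_dev_pm_domain_detach(&dev, dev.pm_domain->detach_power_off);

	/* ->detach() is expected to have cleared the pointer by now. */
	assert(dev.pm_domain == NULL);

	return dev.bad_accesses;
}
```

In this model, model_unbind(true) reports one access made with the clocks
off, while model_unbind(false) reports none. Because the detach callback
clears dev->pm_domain, the second detach attempt is skipped in the first
case, so the domain is never detached twice from the same device.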
