Message-ID: <CAJZ5v0jUGf9QO6h6bcBcTX+nUbDeD0XMhWj1Qb-0qAtZ8EbVsA@mail.gmail.com>
Date: Mon, 16 Jun 2025 13:18:57 +0200
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Claudiu Beznea <claudiu.beznea@...on.dev>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Jonathan Cameron <jic23@...nel.org>,
Dmitry Torokhov <dmitry.torokhov@...il.com>, gregkh@...uxfoundation.org, dakr@...nel.org,
len.brown@...el.com, pavel@...nel.org, ulf.hansson@...aro.org,
daniel.lezcano@...aro.org, linux-kernel@...r.kernel.org,
linux-pm@...r.kernel.org, bhelgaas@...gle.com, geert@...ux-m68k.org,
linux-iio@...r.kernel.org, linux-renesas-soc@...r.kernel.org,
fabrizio.castro.jz@...esas.com,
Claudiu Beznea <claudiu.beznea.uj@...renesas.com>
Subject: Re: [PATCH v3 1/2] PM: domains: Add devres variant for dev_pm_domain_attach()

On Mon, Jun 16, 2025 at 11:37 AM Claudiu Beznea
<claudiu.beznea@...on.dev> wrote:
>
> Hi, Rafael,
>
> On 13.06.2025 13:02, Rafael J. Wysocki wrote:
> > On Fri, Jun 13, 2025 at 9:39 AM Claudiu Beznea <claudiu.beznea@...on.dev> wrote:
> >>
> >> Hi, Rafael,
> >>
> >> On 09.06.2025 22:59, Rafael J. Wysocki wrote:
> >>> On Sat, Jun 7, 2025 at 3:06 PM Jonathan Cameron <jic23@...nel.org> wrote:
> >>>>
> >>>> On Fri, 6 Jun 2025 22:01:52 +0200
> >>>> "Rafael J. Wysocki" <rafael@...nel.org> wrote:
> >>>>
> >>>> Hi Rafael,
> >>>>
> >>>>> On Fri, Jun 6, 2025 at 8:55 PM Dmitry Torokhov
> >>>>> <dmitry.torokhov@...il.com> wrote:
> >>>>>>
> >>>>>> On Fri, Jun 06, 2025 at 06:00:34PM +0200, Rafael J. Wysocki wrote:
> >>>>>>> On Fri, Jun 6, 2025 at 1:18 PM Claudiu <claudiu.beznea@...on.dev> wrote:
> >>>>>>>>
> >>>>>>>> From: Claudiu Beznea <claudiu.beznea.uj@...renesas.com>
> >>>>>>>>
> >>>>>>>> The dev_pm_domain_attach() function is typically used in bus code alongside
> >>>>>>>> dev_pm_domain_detach(), often following patterns like:
> >>>>>>>>
> >>>>>>>> static int bus_probe(struct device *_dev)
> >>>>>>>> {
> >>>>>>>> 	struct bus_driver *drv = to_bus_driver(_dev->driver);
> >>>>>>>> 	struct bus_device *dev = to_bus_device(_dev);
> >>>>>>>> 	int ret;
> >>>>>>>>
> >>>>>>>> 	// ...
> >>>>>>>>
> >>>>>>>> 	ret = dev_pm_domain_attach(_dev, true);
> >>>>>>>> 	if (ret)
> >>>>>>>> 		return ret;
> >>>>>>>>
> >>>>>>>> 	if (drv->probe)
> >>>>>>>> 		ret = drv->probe(dev);
> >>>>>>>>
> >>>>>>>> 	// ...
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> static void bus_remove(struct device *_dev)
> >>>>>>>> {
> >>>>>>>> 	struct bus_driver *drv = to_bus_driver(_dev->driver);
> >>>>>>>> 	struct bus_device *dev = to_bus_device(_dev);
> >>>>>>>>
> >>>>>>>> 	if (drv->remove)
> >>>>>>>> 		drv->remove(dev);
> >>>>>>>> 	dev_pm_domain_detach(_dev);
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> When the driver's probe function uses devres-managed resources that depend
> >>>>>>>> on the power domain state, those resources are released later during
> >>>>>>>> device_unbind_cleanup().
> >>>>>>>>
> >>>>>>>> Releasing devres-managed resources that depend on the power domain state
> >>>>>>>> after detaching the device from its PM domain can cause failures.
> >>>>>>>>
> >>>>>>>> For example, if the driver uses devm_pm_runtime_enable() in its probe
> >>>>>>>> function, and the device's clocks are managed by the PM domain, then
> >>>>>>>> during removal the runtime PM is disabled in device_unbind_cleanup() after
> >>>>>>>> the clocks have been removed from the PM domain. It may happen that the
> >>>>>>>> devm_pm_runtime_enable() action causes the device to be runtime-resumed.
> >>>>>>>
> >>>>>>> Don't use devm_pm_runtime_enable() then.
> >>>>>>
> >>>>>> What about other devm_ APIs? Are you suggesting that platform drivers
> >>>>>> should not be using devm_clk*(), devm_regulator_*(),
> >>>>>> devm_request_*_irq() and devm_add_action_or_reset()? Because again,
> >>>>>> dev_pm_domain_detach() that is called by platform bus_remove() may shut
> >>>>>> off the device too early, before cleanup code has a chance to execute
> >>>>>> proper cleanup.
> >>>>>>
> >>>>>> The issue is not limited to runtime PM.
> >>>>>>
> >>>>>>>
> >>>>>>>> If the driver-specific runtime PM APIs access registers directly, this
> >>>>>>>> will lead to accessing device registers without clocks being enabled.
> >>>>>>>> Similar issues may occur with other devres actions that access device
> >>>>>>>> registers.
> >>>>>>>>
> >>>>>>>> Add devm_pm_domain_attach(). When replacing the dev_pm_domain_attach() and
> >>>>>>>> dev_pm_domain_detach() in bus probe and bus remove, it ensures that the
> >>>>>>>> device is detached from its PM domain in device_unbind_cleanup(), only
> >>>>>>>> after all of the driver's devres-managed resources have been released.
> >>>>>>>>
> >>>>>>>> For flexibility, the implemented devm_pm_domain_attach() has 2 state
> >>>>>>>> arguments, one for the domain state on attach, one for the domain state on
> >>>>>>>> detach.
> >>>>>>>
> >>>>>>> dev_pm_domain_attach() is not part of the driver API and I'm not convinced at
> >>>>>>
> >>>>>> Is the concern that devm_pm_domain_attach() will be [ab]used by drivers?
> >>>>>
> >>>>> Yes, among other things.
> >>>>
> >>>> Maybe naming could make abuse at least obvious to spot? e.g.
> >>>> pm_domain_attach_with_devm_release()
> >>>
> >>> If I'm not mistaken, it is not even necessary to use devres for this.
> >>>
> >>> You might as well add a dev_pm_domain_detach() call to
> >>> device_unbind_cleanup() after devres_release_all(). There is a slight
> >>> complication related to the second argument of it, but I suppose that
> >>> this can be determined at the attach time and stored in a new device
> >>> PM flag, or similar.
> >>>
> >>
> >> I looked into this solution. I've tested it against all my failure cases
> >> and it worked well.
> >
> > OK
> >
> >>> Note that dev->pm_domain is expected to be cleared by ->detach(), so
> >>> this should not cause the domain to be detached twice in a row from
> >>> the same device, but that needs to be double-checked.
> >>
> >> The genpd_dev_pm_detach() calls genpd_remove_device() ->
> >> dev_pm_domain_set(dev, NULL) which sets the dev->pm_domain = NULL. I can't
> >> find any other detach function in the current code base.
> >
> > There is also acpi_dev_pm_detach() which can be somewhat hard to find,
> > but it calls dev_pm_domain_set(dev, NULL) as well.
> >
> >> The code I've tested for this solution is this one:
> >>
> >> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> >> index b526e0e0f52d..5e9750d007b4 100644
> >> --- a/drivers/base/dd.c
> >> +++ b/drivers/base/dd.c
> >> @@ -25,6 +25,7 @@
> >> #include <linux/kthread.h>
> >> #include <linux/wait.h>
> >> #include <linux/async.h>
> >> +#include <linux/pm_domain.h>
> >> #include <linux/pm_runtime.h>
> >> #include <linux/pinctrl/devinfo.h>
> >> #include <linux/slab.h>
> >> @@ -552,8 +553,11 @@ static void device_unbind_cleanup(struct device *dev)
> >>  	dev->dma_range_map = NULL;
> >>  	device_set_driver(dev, NULL);
> >>  	dev_set_drvdata(dev, NULL);
> >> -	if (dev->pm_domain && dev->pm_domain->dismiss)
> >> -		dev->pm_domain->dismiss(dev);
> >> +	if (dev->pm_domain) {
> >> +		if (dev->pm_domain->dismiss)
> >> +			dev->pm_domain->dismiss(dev);
> >> +		dev_pm_domain_detach(dev, dev->pm_domain->detach_power_off);
> >
> > I would do the "detach" before the "dismiss" to retain the current ordering.
>
> I applied on my local development branch all your suggestions except this
> one because genpd_dev_pm_detach() as well as acpi_dev_pm_detach() set
> dev->pm_domain = NULL.
>
> Due to this I would call first ->dismiss() then ->detach(), as initially
> proposed. Please let me know if you see it differently.

This is a matter of adding one more dev->pm_domain check AFAICS, but OK.
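
For illustration only, the two orderings discussed above can be compared in a small userspace sketch. Everything in it is a stand-in rather than kernel code: the struct layouts are reduced to the fields used here, mock_detach() and mock_dismiss() are hypothetical callbacks, and detach_power_off is the field name taken from the diff quoted above. The only behavior assumed from the thread is that ->detach() clears dev->pm_domain, which is why doing the detach first needs a second dev->pm_domain check before ->dismiss().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; NOT the real definitions. */
struct device;

struct dev_pm_domain {
	void (*detach)(struct device *dev, bool power_off);
	void (*dismiss)(struct device *dev);
	bool detach_power_off;	/* field name as used in the quoted diff */
};

struct device {
	struct dev_pm_domain *pm_domain;
	int detach_calls;	/* bookkeeping for this sketch only */
	int dismiss_calls;
};

/* Hypothetical ->detach(): clears dev->pm_domain, as the thread says the
 * genpd and ACPI detach callbacks do via dev_pm_domain_set(dev, NULL). */
static void mock_detach(struct device *dev, bool power_off)
{
	(void)power_off;
	dev->detach_calls++;
	dev->pm_domain = NULL;
}

/* Hypothetical ->dismiss(): only counts invocations. */
static void mock_dismiss(struct device *dev)
{
	dev->dismiss_calls++;
}

/* Stand-in for dev_pm_domain_detach(): a no-op when nothing is attached. */
static void dev_pm_domain_detach(struct device *dev, bool power_off)
{
	if (dev->pm_domain && dev->pm_domain->detach)
		dev->pm_domain->detach(dev, power_off);
}

/* Detach first, then re-check dev->pm_domain before calling ->dismiss(). */
static void unbind_cleanup_detach_first(struct device *dev)
{
	if (dev->pm_domain)
		dev_pm_domain_detach(dev, dev->pm_domain->detach_power_off);
	if (dev->pm_domain && dev->pm_domain->dismiss)
		dev->pm_domain->dismiss(dev);
}

/* Ordering from the tested diff: ->dismiss() first, then detach.
 * Assumes ->dismiss() leaves dev->pm_domain set, as mock_dismiss() does. */
static void unbind_cleanup_dismiss_first(struct device *dev)
{
	if (dev->pm_domain) {
		if (dev->pm_domain->dismiss)
			dev->pm_domain->dismiss(dev);
		dev_pm_domain_detach(dev, dev->pm_domain->detach_power_off);
	}
}

/* Exercise both orderings against the mock domain. */
static void demo_orderings(void)
{
	struct dev_pm_domain pd = {
		.detach = mock_detach,
		.dismiss = mock_dismiss,
		.detach_power_off = true,
	};
	struct device a = { .pm_domain = &pd };
	struct device b = { .pm_domain = &pd };

	unbind_cleanup_detach_first(&a);
	unbind_cleanup_detach_first(&a);	/* second call is a no-op */
	assert(a.detach_calls == 1 && a.pm_domain == NULL);

	unbind_cleanup_dismiss_first(&b);
	assert(b.dismiss_calls == 1 && b.detach_calls == 1);
}
```

In this model, a ->detach() that clears dev->pm_domain means the detach-first variant never reaches ->dismiss(), while the dismiss-first variant runs both callbacks. Whether that difference matters for real PM domains is not something the sketch can show.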