lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPDyKFrVeRYfRFHX4Tq-mSAFabBTxbJfQoTXaxb2OA+Gbm8tAg@mail.gmail.com>
Date:   Tue, 12 Feb 2019 09:25:59 +0100
From:   Ulf Hansson <ulf.hansson@...aro.org>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux PM <linux-pm@...r.kernel.org>,
        Daniel Vetter <daniel@...ll.ch>,
        Lukas Wunner <lukas@...ner.de>,
        Andrzej Hajda <a.hajda@...sung.com>,
        Russell King - ARM Linux <linux@...linux.org.uk>,
        Lucas Stach <l.stach@...gutronix.de>,
        Linus Walleij <linus.walleij@...aro.org>,
        Thierry Reding <thierry.reding@...il.com>,
        Laurent Pinchart <laurent.pinchart@...asonboard.com>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Joerg Roedel <jroedel@...e.de>
Subject: Re: [PATCH 1/2] PM-runtime: Take suppliers into account in __pm_runtime_set_status()

On Mon, 11 Feb 2019 at 23:41, Rafael J. Wysocki <rafael@...nel.org> wrote:
>
>  On Mon, Feb 11, 2019 at 2:28 PM Ulf Hansson <ulf.hansson@...aro.org> wrote:
> >
> > On Thu, 7 Feb 2019 at 19:46, Rafael J. Wysocki <rjw@...ysocki.net> wrote:
> > >
> > > From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> > >
> > > If the target device has any suppliers, as reflected by device links
> > > to them, __pm_runtime_set_status() does not take them into account,
> > > which is not consistent with the other parts of the PM-runtime
> > > framework and may lead to programming mistakes.
> > >
> > > Modify __pm_runtime_set_status() to take suppliers into account by
> > > activating them upfront if the new status is RPM_ACTIVE and
> > > deactivating them on exit if the new status is RPM_SUSPENDED.
> > >
> > > If the activation of one of the suppliers fails, the new status
> > > will be RPM_SUSPENDED and the (remaining) suppliers will be
> > > deactivated on exit (the child count of the device's parent
> > > will be dropped too then).
> > >
> > > Of course, adding device links locking to __pm_runtime_set_status()
> > > means that it cannot be run fron interrupt context, so make it use
> > > spin_lock_irq() and spin_unlock_irq() instead of spin_lock_irqsave()
> > > and spin_unlock_irqrestore(), respectively.
> > >
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> >
> > Rafael, thanks for working on this!
> >
> > I am running some tests at my side, but still not achieving the
> > behavior I expect to. Will let you know when I have more details, but
> > first some comments below.
> >
> > > ---
> > >  drivers/base/power/runtime.c |   45 ++++++++++++++++++++++++++++++++++++++-----
> > >  1 file changed, 40 insertions(+), 5 deletions(-)
> > >
> > > Index: linux-pm/drivers/base/power/runtime.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/base/power/runtime.c
> > > +++ linux-pm/drivers/base/power/runtime.c
> > > @@ -1102,20 +1102,43 @@ EXPORT_SYMBOL_GPL(pm_runtime_get_if_in_u
> > >   * and the device parent's counter of unsuspended children is modified to
> > >   * reflect the new status.  If the new status is RPM_SUSPENDED, an idle
> > >   * notification request for the parent is submitted.
> > > + *
> > > + * If @dev has any suppliers (as reflected by device links to them), and @status
> > > + * is RPM_ACTIVE, they will be activated upfront and if the activation of one
> > > + * of them fails, the status of @dev will be changed to RPM_SUSPENDED (instead
> > > + * of the @status value) and the suppliers will be deacticated on exit.  The
> > > + * error returned by the failing supplier activation will be returned in that
> > > + * case.
> > >   */
> > >  int __pm_runtime_set_status(struct device *dev, unsigned int status)
> > >  {
> > >         struct device *parent = dev->parent;
> > > -       unsigned long flags;
> > >         bool notify_parent = false;
> > >         int error = 0;
> > >
> > >         if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> > >                 return -EINVAL;
> > >
> > > -       spin_lock_irqsave(&dev->power.lock, flags);
> > > +       /*
> > > +        * If the new status is RPM_ACTIVE, the suppliers can be activated
> > > +        * upfront regardless of the current status, because next time
> > > +        * rpm_put_suppliers() runs, the rpm_active refcounts of the links
> > > +        * involved will be dropped down to one anyway.
> > > +        */
> > > +       if (status == RPM_ACTIVE) {
> > > +               int idx = device_links_read_lock();
> > > +
> > > +               error = rpm_get_suppliers(dev);
> > > +               if (error)
> > > +                       status = RPM_SUSPENDED;
> > > +
> > > +               device_links_read_unlock(idx);
> > > +       }
> >
> > This doesn't look right to me, and more importantly, this isn't
> > consistent with how we treat a parent/child.
>
> It cannot be entirely consistent with that, because you cannot walk
> the suppliers under the device's power.lock.
>
> The idea here is that activating suppliers upfront if the new status
> is RPM_ACTIVE shouldn't hurt regardless.

I see. However, perhaps we can just read out the needed flags/states
(within device's power.lock) before walking the suppliers.

In principle, those flags/states shouldn't really change, in case
runtime PM have been properly disabled by the caller.

>
> > More precisely, I think you need to check "if
> > (!dev->power.runtime_error && !dev->power.disable_depth)" and also
> > whether "dev->power.runtime_status == status", before deciding to call
> > rpm_get_suppliers() above. Otherwise you may end up resuming suppliers
> > and/or increasing the link->rpm_active count, when you shouldn't.
>
> Resuming suppliers unnecessarily is not particularly efficient, but it
> is not incorrect.  Incrementing their rpm_active temporarily also
> isn't incorrect as long as the rpm_active values are correct on exit
> (and note that incementing them if the consumer's status is RPM_ACTIVE
> doesn't even matter).
>
> > In other words, expecting __pm_runtime_set_status() to be called in
> > "balanced" manner isn't correct.
>
> There is no such expectation here.

You are right!

I didn't realize that rpm_put_suppliers() actually doesn't drop the
usage count only by one, but instead as many times as needed to let
rpm_active reach one.

>
> There is a possible race between __pm_runtime_set_status() and runtime
> suspend or resume of the device in case PM-runtime is enabled for it
> when __pm_runtime_set_status() is called, but it shouldn't occur if
> __pm_runtime_set_status() is used correctly (that is, when PM-runtime
> is disabled for the device).
>
> I think I know how to avoid that race, though, so I'm going to post an
> incremental fix if that works out.

Okay, let's see what comes out of this.

Kind regards
Uffe

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ