lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ5v0iWCWdjP9Gku2uV0Qz-Hp6ZEHDspaPVzHPPzHfvyREeWA@mail.gmail.com>
Date: Mon, 2 Dec 2024 22:14:57 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Saravana Kannan <saravanak@...gle.com>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Pavel Machek <pavel@....cz>, Len Brown <len.brown@...el.com>, 
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Ingo Molnar <mingo@...hat.com>, 
	Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>, 
	Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>, 
	Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Valentin Schneider <vschneid@...hat.com>, Geert Uytterhoeven <geert@...ux-m68k.org>, Marek Vasut <marex@...x.de>, 
	Bird@...gle.com, Tim <Tim.Bird@...y.com>, kernel-team@...roid.com, 
	linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 2/5] PM: sleep: Remove unnecessary mutex lock when
 waiting on parent

On Mon, Dec 2, 2024 at 9:46 PM Saravana Kannan <saravanak@...gle.com> wrote:
>
> On Mon, Dec 2, 2024 at 12:16 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
> >
> > On Mon, Dec 2, 2024 at 9:11 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
> > >
> > > Sorry for the delay.
> > >
> > > On Thu, Nov 14, 2024 at 11:09 PM Saravana Kannan <saravanak@...gle.com> wrote:
> > > >
> > > > Locking is not needed to do get_device(dev->parent). We either get a NULL
> > > > (if the parent was cleared) or the actual parent. Also, when a device is
> > > > deleted (device_del()) and removed from the dpm_list, its completion
> > > > variable is also complete_all()-ed. So, we don't have to worry about
> > > > waiting indefinitely on a deleted parent device.
> > >
> > > The device_pm_initialized(dev) check before get_device(dev->parent)
> > > doesn't make sense without the locking and that's the whole point of
> > > it.
> >
> > Hmm, not really.
> >
> > How is the parent prevented from going away in get_device() right
> > after the initial dev check without the locking?
>
> Not sure what you mean by "go away"? But get_device() is going to keep
> a non-zero refcount on the parent so that struct doesn't get freed.

That's after it has returned.

There is nothing to prevent dev from being freed in get_device()
itself before the kobject_get(&dev->kobj) call.

So when get_device() is called, there needs to be an active ref on the
device already or something else to prevent the target device object
from being freed.

In this particular case it is the lock along with the
device_pm_initialized(dev) check on the child.

> The parent itself can still "go away" in terms of unbinding or
> removing the children from the dpm_list(). But that's what the
> device_pm_initialized() check is for. When a device_del() is called,
> it's removed from the dpm_list. The actual freeing comes later. But we
> aren't/weren't checking for that anyway.
>
> >
> > > > Signed-off-by: Saravana Kannan <saravanak@...gle.com>
> > > > ---
> > > >  drivers/base/power/main.c | 13 ++-----------
> > > >  1 file changed, 2 insertions(+), 11 deletions(-)
> > > >
> > > > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > > > index 86e51b9fefab..9b9b6088e56a 100644
> > > > --- a/drivers/base/power/main.c
> > > > +++ b/drivers/base/power/main.c
> > > > @@ -284,18 +284,9 @@ static bool dpm_wait_for_superior(struct device *dev, bool async)
> > > >          * counting the parent once more unless the device has been deleted
> > > >          * already (in which case return right away).
> > > >          */
> > > > -       mutex_lock(&dpm_list_mtx);
> > > > -
> > > > -       if (!device_pm_initialized(dev)) {
> > > > -               mutex_unlock(&dpm_list_mtx);
> > > > -               return false;
> > > > -       }
> > > > -
> > > >         parent = get_device(dev->parent);
> > > > -
> > > > -       mutex_unlock(&dpm_list_mtx);
> > > > -
> > > > -       dpm_wait(parent, async);
> > > > +       if (device_pm_initialized(dev))
> > > > +               dpm_wait(parent, async);
> > >
> > > This is racy, so what's the point?
> > >
> > > You can just do
> > >
> > > parent = get_device(dev->parent);
> > > dpm_wait(parent, async);
>
> Parent struct device being there isn't the same as whether this device
> is in the dpm_list? So, shouldn't we still check this?
>
> Also, is it really racy anymore with the new algorithm? We don't kick
> off the subordinates until after the parent is done.
>
> > >
> > > and please update the comment above this.
>
> Will do.
>
> Thanks,
> Saravana
> > >
> > > >         put_device(parent);
> > > >
> > > >         dpm_wait_for_suppliers(dev, async);
> > > > --

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ