Message-ID: <CAMuHMdW_g4ODMH6zx0f8KHvy7MnFE=K1zPU=uRk02D9g2UP8Ow@mail.gmail.com>
Date:   Wed, 1 Jul 2020 13:07:16 +0200
From:   Geert Uytterhoeven <geert@...ux-m68k.org>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Saravana Kannan <saravanak@...gle.com>,
        "Cc: Android Kernel" <kernel-team@...roid.com>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1] driver core: Fix suspend/resume order issue with
 deferred probe

Hi Rafael,

On Tue, Jun 30, 2020 at 6:11 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
> On Tue, Jun 30, 2020 at 5:39 PM Greg Kroah-Hartman
> <gregkh@...uxfoundation.org> wrote:
> > On Tue, Jun 30, 2020 at 03:50:58PM +0200, Rafael J. Wysocki wrote:
> > > On Fri, Jun 26, 2020 at 10:53 PM Geert Uytterhoeven
> > > <geert@...ux-m68k.org> wrote:
> > > > On Fri, Jun 26, 2020 at 10:34 PM Saravana Kannan <saravanak@...gle.com> wrote:
> > > > > On Fri, Jun 26, 2020 at 4:27 AM Rafael J. Wysocki <rafael@...nel.org> wrote:
> > > > > > On Thu, Jun 25, 2020 at 7:52 PM Saravana Kannan <saravanak@...gle.com> wrote:
> > > > > > > On Thu, Jun 25, 2020 at 10:47 AM Rafael J. Wysocki <rafael@...nel.org> wrote:
> > > > > > > > Note that deferred probing gets in the way here and so the problem is
> > > > > > > > related to it.
> > > > > > >
> > > > > > > I mean, we officially support deferred probing. Shouldn't we fix it so
> > > > > > > that it doesn't break suspend/resume?
> > > > > >
> > > > > > Yes, we should fix deferred probing.
> > > >
> > > > Please take into account that breakage is an actual regression.
> > > >
> > > > > > > Also, it's pretty easy to have
> > > > > > > cases where one module probes multiple device instances and loading it
> > > > > > > in one order would break dpm_list order for one device and loading it
> > > > > > > in another order would break it for another device. And there would be
> > > > > > > no "proper" order to load modules (because module order != device
> > > > > > > order).
> > > > > >
> > > > > > I'm not saying that the current code is perfect.  I'm saying that the
> > > > > > fix as proposed adds too much cost for everybody who may not care IMO.
> > > > >
> > > > > Ok, how about I don't do this reordering until we see the first
> > > > > deferred probe request? Will that work for you? In that case, systems
> > > > > with no deferred probing will not incur any reordering cost. Or if
> > > > > reordering starts only towards the end, all the previous probes won't
> > > > > incur reordering cost.
> > > >
> > > > That first deferred probe request is more or less as of the first probe,
> > > > since commit 93d2e4322aa74c1a ("of: platform: Batch fwnode parsing when
> > > > adding all top level devices"), at least on DT systems.
> > >
> > > The deferred probe reordering of devices to the end of dpm_list
> > > started in 2012, so it is nothing new, and it demonstrably works for
> > > devices where the dependencies are known to the driver core.
> > >
> > > That said, in the cases when the dependencies are known to the driver
> > > core, it is also unnecessary to reorder dpm_list in
> > > deferred_probe_work_func(), because the right ordering of it is going
> > > to be determined elsewhere.
> > >
> > > Also commit 494fd7b7ad10 ("PM / core: fix deferred probe breaking
> > > suspend resume order") is not the source of the problem here, because
> > > the problem would have still been there without it, due to the
> > > device_pm_move_last() that was there before, so the Fixes: tag
> > > pointing to that commit is misleading.
> > >
> > > Now, because 716a7a259690 ("driver core: fw_devlink: Add support for
> > > batching fwnode parsing") is an optimization and the regression is
> > > present because of it AFAICS, the best way to address it at that point
> > > would be to revert commit 716a7a259690 for 5.8 and maybe do the
> > > optimization more carefully.
> > >
> > > Greg, what do you think?
> >
> > I've been ignoring this and letting you all sort it out :)
> >
> > But if you think that patch should be reverted, I'll not object and will
> > be glad to do it if this solves the issue.
>
> Well, if Geert can confirm that reverting commit 716a7a259690 makes
> the problem go away, IMO this would be the most reasonable thing to do
> at this stage of the cycle without risking that more regressions will
> be introduced.
>
> Geert?

Reverting commit 716a7a25969003d8 ("driver core: fw_devlink: Add support
for batching fwnode parsing") requires reverting commits
fefcfc968723caf9 ("driver core: Remove check in
driver_deferred_probe_force_trigger()") and 93d2e4322aa74c1a ("of:
platform: Batch fwnode parsing when adding all top level devices"),
too.

While reverting the latter is sufficient to fix the regression for me, I
can confirm that reverting all three fixes the issue, too.
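
For readers following the thread, the ordering problem discussed above can be
sketched with a toy model. This is only an illustration, not kernel code: the
list-of-strings dpm_list, the device names, and the helpers move_last(),
suspend_order() and order_violations() are all hypothetical. It assumes that
suspend walks dpm_list from the tail, and that a deferred probe retry moves
the retried device to the end of the list.

```python
def move_last(dpm_list, dev):
    """Toy stand-in for moving a device to the tail of dpm_list."""
    dpm_list.remove(dev)
    dpm_list.append(dev)


def suspend_order(dpm_list):
    """Assumed model: suspend walks the list from the tail to the head."""
    return list(reversed(dpm_list))


def order_violations(dpm_list, supplier_of):
    """Return (consumer, supplier) pairs where the supplier suspends first."""
    order = suspend_order(dpm_list)
    pos = {dev: i for i, dev in enumerate(order)}
    return [(c, s) for c, s in supplier_of.items() if pos[s] < pos[c]]


# Hypothetical devices: "consumer" depends on "supplier".
dpm_list = ["supplier", "consumer"]
supplier_of = {"consumer": "supplier"}

# Registration order is fine: the consumer suspends before its supplier.
before = order_violations(dpm_list, supplier_of)

# The supplier's probe is deferred and retried, so it moves to the tail,
# while the consumer stays where it was.
move_last(dpm_list, "supplier")
after = order_violations(dpm_list, supplier_of)

print(before, after)
```

In this model the list is correctly ordered until the supplier is moved to the
tail; after that the supplier would be suspended before its consumer.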

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
