lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 3 Oct 2020 03:13:42 +0300
From:   Laurent Pinchart <laurent.pinchart@...asonboard.com>
To:     Saravana Kannan <saravanak@...gle.com>
Cc:     Grygorii Strashko <grygorii.strashko@...com>,
        Rob Herring <robh+dt@...nel.org>,
        Frank Rowand <frowand.list@...il.com>,
        Geert Uytterhoeven <geert+renesas@...der.be>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-omap <linux-omap@...r.kernel.org>,
        "open list:THERMAL" <linux-pm@...r.kernel.org>,
        Peter Ujfalusi <peter.ujfalusi@...com>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Tomi Valkeinen <tomi.valkeinen@...com>,
        Tony Lindgren <tony@...mide.com>,
        Ulf Hansson <ulf.hansson@...aro.org>,
        Android Kernel Team <kernel-team@...roid.com>,
        "open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS" 
        <devicetree@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1] of: platform: Batch fwnode parsing in the
 init_machine() path

Hi Saravana,

On Fri, Oct 02, 2020 at 12:56:30PM -0700, Saravana Kannan wrote:
> On Fri, Oct 2, 2020 at 11:35 AM 'Grygorii Strashko' via kernel-team wrote:
> > On 02/10/2020 21:27, Laurent Pinchart wrote:
> > > On Fri, Oct 02, 2020 at 10:58:55AM -0700, Saravana Kannan wrote:
> > >> On Fri, Oct 2, 2020 at 10:55 AM Laurent Pinchart wrote:
> > >>> On Fri, Oct 02, 2020 at 10:51:51AM -0700, Saravana Kannan wrote:
> > >>>> On Fri, Oct 2, 2020 at 7:08 AM Rob Herring <robh+dt@...nel.org> wrote:
> > >>>>> On Thu, Oct 1, 2020 at 5:59 PM Saravana Kannan <saravanak@...gle.com> wrote:
> > >>>>>>
> > >>>>>> When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when
> > >>>>>> adding all top level devices") optimized the fwnode parsing when all top
> > >>>>>> level devices are added, it missed out optimizing this for platform
> > >>>>>> where the top level devices are added through the init_machine() path.
> > >>>>>>
> > >>>>>> This commit does the optimization for all paths by simply moving the
> > >>>>>> fw_devlink_pause/resume() inside of_platform_default_populate().
> > >>>>>>
> > >>>>>> Reported-by: Tomi Valkeinen <tomi.valkeinen@...com>
> > >>>>>> Signed-off-by: Saravana Kannan <saravanak@...gle.com>
> > >>>>>> ---
> > >>>>>>   drivers/of/platform.c | 19 +++++++++++++++----
> > >>>>>>   1 file changed, 15 insertions(+), 4 deletions(-)
> > >>>>>>
> > >>>>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> > >>>>>> index 071f04da32c8..79972e49b539 100644
> > >>>>>> --- a/drivers/of/platform.c
> > >>>>>> +++ b/drivers/of/platform.c
> > >>>>>> @@ -501,8 +501,21 @@ int of_platform_default_populate(struct device_node *root,
> > >>>>>>                                   const struct of_dev_auxdata *lookup,
> > >>>>>>                                   struct device *parent)
> > >>>>>>   {
> > >>>>>> -       return of_platform_populate(root, of_default_bus_match_table, lookup,
> > >>>>>> -                                   parent);
> > >>>>>> +       int ret;
> > >>>>>> +
> > >>>>>> +       /*
> > >>>>>> +        * fw_devlink_pause/resume() are only safe to be called around top
> > >>>>>> +        * level device addition due to locking constraints.
> > >>>>>> +        */
> > >>>>>> +       if (!root)
> > >>>>>> +               fw_devlink_pause();
> > >>>>>> +
> > >>>>>> +       ret = of_platform_populate(root, of_default_bus_match_table, lookup,
> > >>>>>> +                                  parent);
> > >>>>>
> > >>>>> of_platform_default_populate() vs. of_platform_populate() is just a
> > >>>>> different match table. I don't think the behavior should otherwise be
> > >>>>> different.
> > >>>>>
> > >>>>> There's also of_platform_probe() which has slightly different matching
> > >>>>> behavior. It should not behave differently either with respect to
> > >>>>> devlinks.
> > >>>>
> > >>>> So I'm trying to do this only when the top level devices are added for
> > >>>> the first time. of_platform_default_populate() seems to be the most
> > >>>> common path. For other cases, I think we just need to call
> > >>>> fw_devlink_pause/resume() wherever the top level devices are added for
> > >>>> the first time. As I said in the other email, we can't add
> > >>>> fw_devlink_pause/resume() by default to of_platform_populate().
> > >>>>
> > >>>> Do you have other ideas for achieving "call fw_devlink_pause/resume()
> > >>>> only when top level devices are added for the first time"?
> > >>>
> > >>> I'm not an expert in this domain, but before investigating it, would you
> > >>> be able to share a hack patch that implements this (in the most simple
> > >>> way) to check if it actually fixes the delays I experience on my system
> > >>> ?
> > >>
> > >> So I take it the patch I sent out didn't work for you? Can you tell me
> > >> what machine/DT you are using?
> > >
> > > I've replied to the patch:
> > >
> > > Based on v5.9-rc5, before the patch:
> > >
> > > [    0.652887] cpuidle: using governor menu
> > > [   12.349476] No ATAGs?
> > >
> > > After the patch:
> > >
> > > [    0.650460] cpuidle: using governor menu
> > > [   12.262101] No ATAGs?
> > >
> > > I'm using an AM57xx EVM, whose DT is not upstream, but it's essentially
> > > a am57xx-beagle-x15-revb1.dts (it includes that DTS) with a few
> > > additional nodes for GPIO keys, LCD panel, backlight and touchscreen.
> > >
> >
> > hope you are receiving my mails as I've provided you with all required information already [1]
> 
> Laurent/Grygorii,
> 
> Looks like I'm definitely missing emails. Sorry about the confusion.
> 
> I have some other urgent things on my plate right now. Is it okay if I
> get to this in a day or two? In the end, we'll find a solution that
> addresses most/all of the delay.

No issue on my side.

By the way, during initial investigations, I've traced code paths to
figure out if there was a particular step that would consume a large
amount of time, and found out that of_platform_populate() ends up
executing devlink-related code that seems to have an O(n^3) complexity
on the number of devices, with a few dozens of milliseconds for each
iteration. That's a very bad complexity.

-- 
Regards,

Laurent Pinchart

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ