Message-ID: <CAGETcx-0bStPx8sF3BtcJFiu74NwiB0btTQ+xx_B=8B37TEb8w@mail.gmail.com>
Date:   Fri, 1 Jul 2022 01:10:48 -0700
From:   Saravana Kannan <saravanak@...gle.com>
To:     Tony Lindgren <tony@...mide.com>
Cc:     Rob Herring <robh@...nel.org>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Kevin Hilman <khilman@...nel.org>,
        Ulf Hansson <ulf.hansson@...aro.org>,
        Len Brown <len.brown@...el.com>, Pavel Machek <pavel@....cz>,
        Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
        Andrew Lunn <andrew@...n.ch>,
        Heiner Kallweit <hkallweit1@...il.com>,
        Russell King <linux@...linux.org.uk>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Linus Walleij <linus.walleij@...aro.org>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        David Ahern <dsahern@...nel.org>,
        Android Kernel Team <kernel-team@...roid.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "open list:THERMAL" <linux-pm@...r.kernel.org>,
        Linux IOMMU <iommu@...ts.linux-foundation.org>,
        netdev <netdev@...r.kernel.org>,
        "open list:GPIO SUBSYSTEM" <linux-gpio@...r.kernel.org>,
        Alexander Stein <alexander.stein@...tq-group.com>
Subject: Re: [PATCH v2 1/9] PM: domains: Delete usage of driver_deferred_probe_check_state()

On Thu, Jun 30, 2022 at 11:12 PM Tony Lindgren <tony@...mide.com> wrote:
>
> * Tony Lindgren <tony@...mide.com> [220701 08:33]:
> > * Saravana Kannan <saravanak@...gle.com> [220630 23:25]:
> > > On Thu, Jun 30, 2022 at 4:26 PM Rob Herring <robh@...nel.org> wrote:
> > > >
> > > > On Thu, Jun 30, 2022 at 5:11 PM Saravana Kannan <saravanak@...gle.com> wrote:
> > > > >
> > > > > On Mon, Jun 27, 2022 at 2:10 AM Tony Lindgren <tony@...mide.com> wrote:
> > > > > >
> > > > > > * Saravana Kannan <saravanak@...gle.com> [220623 08:17]:
> > > > > > > On Thu, Jun 23, 2022 at 12:01 AM Tony Lindgren <tony@...mide.com> wrote:
> > > > > > > >
> > > > > > > > * Saravana Kannan <saravanak@...gle.com> [220622 19:05]:
> > > > > > > > > On Tue, Jun 21, 2022 at 9:59 PM Tony Lindgren <tony@...mide.com> wrote:
> > > > > > > > > > > This issue is not directly related to fw_devlink. It is a side effect of
> > > > > > > > > > removing driver_deferred_probe_check_state(). We no longer return
> > > > > > > > > > -EPROBE_DEFER at the end of driver_deferred_probe_check_state().
> > > > > > > > >
> > > > > > > > > Yes, I understand the issue. But driver_deferred_probe_check_state()
> > > > > > > > > was deleted because fw_devlink=on should have short circuited the
> > > > > > > > > > probe attempt with an -EPROBE_DEFER before reaching the bus/driver
> > > > > > > > > probe function and hitting this -ENOENT failure. That's why I was
> > > > > > > > > asking the other questions.
> > > > > > > >
> > > > > > > > OK. So where is the -EPROBE_DEFER supposed to happen without
> > > > > > > > driver_deferred_probe_check_state() then?
> > > > > > >
> > > > > > > device_links_check_suppliers() call inside really_probe() would short
> > > > > > > circuit and return an -EPROBE_DEFER if the device links are created as
> > > > > > > expected.
> > > > > >
> > > > > > OK
> > > > > >
> > > > > > > > Hmm so I'm not seeing any supplier for the top level ocp device in
> > > > > > > > the booting case without your patches. I see the suppliers for the
> > > > > > > > ocp child device instances only.
> > > > > > >
> > > > > > > Hmmm... this is strange (that the device link isn't there), but this
> > > > > > > is what I suspected.
> > > > > >
> > > > > > Yup, maybe it's because of the supplier being a device in the child
> > > > > > interconnect for the ocp.
> > > > >
> > > > > Ugh... yeah, this is why the normal (not SYNC_STATE_ONLY) device link
> > > > > isn't being created.
> > > > >
> > > > > > So the aggregated view is something like (I had to set tabs = 4 spaces
> > > > > to fit it within 80 cols):
> > > > >
> > > > >     ocp: ocp {         <========================= Consumer
> > > > >         compatible = "simple-pm-bus";
> > > > >         power-domains = <&prm_per>; <=========== Supplier ref
> > > > >
> > > > > >         l4_wkup: interconnect@...00000 {
> > > > >             compatible = "ti,am33xx-l4-wkup", "simple-pm-bus";
> > > > >
> > > > >             segment@...000 {  /* 0x44e00000 */
> > > > >                 compatible = "simple-pm-bus";
> > > > >
> > > > >                 target-module@0 { /* 0x44e00000, ap 8 58.0 */
> > > > >                     compatible = "ti,sysc-omap4", "ti,sysc";
> > > > >
> > > > >                     prcm: prcm@0 {
> > > > >                         compatible = "ti,am3-prcm", "simple-bus";
> > > > >
> > > > >                         prm_per: prm@c00 { <========= Actual Supplier
> > > > >                             compatible = "ti,am3-prm-inst", "ti,omap-prm-inst";
> > > > >                         };
> > > > >                     };
> > > > >                 };
> > > > >             };
> > > > >         };
> > > > >     };
> > > > >
> > > > > The power-domain supplier is the great-great-great-grand-child of the
> > > > > consumer. It's not clear to me how this is valid. What does it even
> > > > > mean?
> > > > >
> > > > > Rob, is this considered a valid DT?
> > > >
> > > > Valid DT for broken h/w.
> > >
> > > I'm not sure even in that case it's valid. When the parent device is
> > > in reset (when the SoC is coming out of reset), there's no way the
> > > descendant is functional. And if the descendant is not functional, how
> > > is the parent device powered up? This just feels like an incorrect
> > > representation of the real h/w.
> >
> > It should be a correct representation based on scanning the interconnects
> > and looking at the documentation. Some interconnect parts are wired
> > always-on and some interconnect instances may be dual-mapped.

Thanks for helping to debug this. Appreciate it.

> >
> > We have a quirk to probe prm/prcm first with pdata_quirks_init_clocks().

:'(

I checked out the code. These prm devices just get populated with NULL
as the parent. So they are effectively top level devices from the
perspective of driver core.

> > Maybe that also now fails in addition to the top level interconnect
> > probing no longer producing -EPROBE_DEFER.

As far as I can tell pdata_quirks_init_clocks() is just adding these
prm devices (amongst other drivers). So I don't expect that to fail.

> >
> > > > So the domain must be default on and then simple-pm-bus is going to
> > > > hold a reference to the domain preventing it from ever getting powered
> > > > off and things seem to work. Except what happens during suspend?
> > >
> > > But how can simple-pm-bus even get a reference? The PM domain can't
> > > get added until we are well into the probe of the simple-pm-bus and
> > > AFAICT the genpd attach is done before the driver probe is even
> > > called.
> >
> > The prm/prcm gets of_platform_populate() called on it early.

:'(

> The hackish patch below makes things boot for me, not convinced this
> is the preferred fix compared to earlier deferred probe handling though.
> Going back to the init level tinkering seems like a step back to me.

The goal of fw_devlink is to avoid init level tinkering and it does
help with that in general. But these kinds of quirks are going to need
a few exceptions -- with them being quirks and all. And this change
will avoid an unnecessary deferred probe (that used to happen even
before my change).
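
To illustrate why moving the prm driver to subsys_initcall() helps, here is a minimal user-space sketch of initcall ordering. It is not kernel code. It relies only on two facts: builtin_platform_driver() registers its driver at device_initcall level, and subsys_initcall level runs before device_initcall level. The helpers register_initcall() and boot_order(), and the names passed to them, are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Two of the kernel's initcall levels; lower numbers run earlier at boot. */
enum initcall_level {
    LEVEL_SUBSYS = 4,   /* subsys_initcall() */
    LEVEL_DEVICE = 6,   /* device_initcall(), used by builtin_platform_driver() */
};

struct initcall {
    enum initcall_level level;
    const char *name;
};

#define MAX_CALLS 8
static struct initcall calls[MAX_CALLS];
static size_t n_calls;

/* Hypothetical helper: record an initcall at the given level. */
static void register_initcall(enum initcall_level level, const char *name)
{
    assert(n_calls < MAX_CALLS);
    calls[n_calls].level = level;
    calls[n_calls].name = name;
    n_calls++;
}

/*
 * Hypothetical helper: fill out[] with names in the order they would run,
 * i.e. ascending level, and registration order within one level.
 * Returns the number of entries written.
 */
static size_t boot_order(const char **out, size_t out_len)
{
    size_t n = 0;

    for (int level = 0; level <= 7; level++)
        for (size_t i = 0; i < n_calls && n < out_len; i++)
            if ((int)calls[i].level == level)
                out[n++] = calls[i].name;
    return n;
}
```

If a consumer driver is registered at LEVEL_DEVICE and the prm driver at LEVEL_SUBSYS, boot_order() lists the prm driver first regardless of registration order. That is the effect the patch below relies on: the supplier's driver is already registered by the time the consumer probes.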

The other option to handle this quirk is to create the invalid
(consumer is parent of supplier) fwnode_link between the prm device
and its consumers when the prm device is populated. Then fw_devlink
will end up creating a device link when ocp gets added. But I'm not
sure if it's going to be easy to find and add all those consumers.
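
As a rough illustration of what "consumer is parent of supplier" means for the DT quoted above, the following user-space sketch walks parent pointers from the supplier up to the root and reports how many generations below the consumer it sits. It is only a model of the hierarchy, not the fw_devlink implementation. The struct node type and generations_below() are hypothetical names, and the node names are shortened from the DT excerpt.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a firmware node: a name and a parent pointer. */
struct node {
    const char *name;
    const struct node *parent;
};

/*
 * Hypothetical helper: return how many generations 'node' sits below
 * 'ancestor' (0 if they are the same node), or -1 if 'ancestor' is not
 * on the parent chain of 'node'.
 */
static int generations_below(const struct node *ancestor,
                             const struct node *node)
{
    int depth = 0;

    assert(ancestor != NULL);
    for (; node != NULL; node = node->parent, depth++)
        if (node == ancestor)
            return depth;
    return -1;
}

/* Hierarchy from the quoted DT: ocp is the consumer, prm_per the supplier. */
static const struct node ocp           = { "ocp", NULL };
static const struct node l4_wkup       = { "l4_wkup", &ocp };
static const struct node segment       = { "segment", &l4_wkup };
static const struct node target_module = { "target-module", &segment };
static const struct node prcm          = { "prcm", &target_module };
static const struct node prm_per       = { "prm_per", &prcm };
```

Here generations_below(&ocp, &prm_per) returns 5, matching the "great-great-great-grand-child" description, while generations_below(&prm_per, &ocp) returns -1. A check of this kind is one way to spot the supplier-under-consumer case that would need the special handling described above.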

I'd say, for now, let's go with this patch below. I'll see if I can
get fw_devlink to handle these odd quirks without breaking the normal
cases or making them significantly slower. But that'll take some time
and I'm not sure there'll be a nice solution.

Thanks,
Saravana

> Regards,
>
> Tony
>
> 8< ----------------
> diff --git a/drivers/soc/ti/omap_prm.c b/drivers/soc/ti/omap_prm.c
> --- a/drivers/soc/ti/omap_prm.c
> +++ b/drivers/soc/ti/omap_prm.c
> @@ -991,4 +991,9 @@ static struct platform_driver omap_prm_driver = {
>                 .of_match_table = omap_prm_id_table,
>         },
>  };
> -builtin_platform_driver(omap_prm_driver);
> +
> +static int __init omap_prm_init(void)
> +{
> +        return platform_driver_register(&omap_prm_driver);
> +}
> +subsys_initcall(omap_prm_init);
> --
> 2.36.1
