lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGETcx8cb4JSNMqRXfq2zAMjky_033wu1Ygrb1YqAL=a8voyRA@mail.gmail.com>
Date:   Mon, 27 Feb 2023 14:00:51 -0800
From:   Saravana Kannan <saravanak@...gle.com>
To:     Linus Walleij <linus.walleij@...aro.org>
Cc:     Greg KH <gregkh@...uxfoundation.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Ulf Hansson <ulf.hansson@...aro.org>,
        Android Kernel Team <kernel-team@...roid.com>
Subject: Re: Regression in probing some AMBA devices possibly devlink related

On Mon, Feb 27, 2023 at 1:07 PM Linus Walleij <linus.walleij@...aro.org> wrote:
>
> On Sun, Feb 26, 2023 at 10:55 PM Linus Walleij <linus.walleij@...aro.org> wrote:
> > On Sun, Feb 26, 2023 at 12:56 AM Saravana Kannan <saravanak@...gle.com> wrote:
> > > On Sat, Feb 25, 2023 at 6:29 AM Linus Walleij <linus.walleij@...aro.org> wrote:
> > > >
> > > > Hi Saravana,
> > > >
> > > > I have a boot regression for Ux500 on mainline, but bisecting mainline
> > > > isn't quite working for misc reasons :/
> > > >
> > > > I'm not sure about this regression, but it smells like devlink-related.
> > > >
> > > > Ux500 have a bunch of normal and some AMBA devices. After
> > > > boot this happens and we hang waiting for rootfs (which is on eMMC):
> > >
> > > Hmmm... my recent changes were definitely tested on systems with AMBA
> > > devices and it worked.
> >
> > Some of them work, such as the TWD, interrupt controller GIC
> > and the CoreSight stuff.
> >
> > > > [   31.849586] amba 80126000.mmc: deferred probe pending
> > > > [   31.854801] amba 80118000.mmc: deferred probe pending
> > > > [   31.859895] amba 80005000.mmc: deferred probe pending
> > > > [   31.870297] amba 80120000.uart: deferred probe pending
> > > > [   31.875472] amba 80121000.uart: deferred probe pending
> > > > [   31.880689] amba 80004000.i2c: deferred probe pending
> > > > [   31.885799] amba 80128000.i2c: deferred probe pending
> > > > [   31.890932] amba 80110000.i2c: deferred probe pending
> > > > [   51.688361] vmem_3v3: disabling
> > >
> > > What does /debug/devices_deferred say about these? That should tell us
> > > exactly what's blocking it.
> >
> > Can't get to check that sadly, because the root fs is on eMMC and
> > that one is deferring indefinitely.
> >
> > I tried to compile in an initramfs but that became too big for this
> > target. I'll check if I can use that on another target which I
> > can boot directly from RAM.
> >
> > > Also if you convert all the pr_debug and dev_dbg in
> > > drivers/base/core.c that should give us enough of an idea of what's
> > > happening. Can you do that too and send it as an attachment (I logs
> > > are hard to read when they get word wrapped)?
> >
> > This I could do!
> >
> > See attachment.
> >
> > It seems these devices are waiting for the PRCC reset-controller
> > which is in
> > drivers/clk/ux500/reset-prcc.c
> >
> > Very well! I went into
> > arch/arm/boot/dts/ste-dbx5x0.dtsi
> > and commented out all "resets = < &prcc....>; statements,
> > and indeed then the system comes up fine! (The resets are
> > not vital to boot the system.)
> >
> > But this is weird because the driver for the resets is spawn from
> > the same device node that is spawning the clocks...
> > All resources (clocks and resets) are spawn from here:
> > drivers/clk/ux500/u8500_of_clk.c
> >
> > From these nodes in the DTS
> > arch/arm/boot/dts/ste-dbx5x0.dtsi:
> >
> >                 clocks {
> >                         compatible = "stericsson,u8500-clks";
> >                         /*
> >                          * Registers for the CLKRST block on peripheral
> >                          * groups 1, 2, 3, 5, 6,
> >                          */
> >                         reg = <0x8012f000 0x1000>, <0x8011f000 0x1000>,
> >                             <0x8000f000 0x1000>, <0xa03ff000 0x1000>,
> >                             <0xa03cf000 0x1000>;
> >
> >                         prcmu_clk: prcmu-clock {
> >                                 #clock-cells = <1>;
> >                         };
> >
> >                         prcc_pclk: prcc-periph-clock {
> >                                 #clock-cells = <2>;
> >                         };
> >
> >                         prcc_kclk: prcc-kernel-clock {
> >                                 #clock-cells = <2>;
> >                         };
> >
> >                         prcc_reset: prcc-reset-controller {
> >                                 #reset-cells = <2>;
> >                         };
> >
> >                         rtc_clk: rtc32k-clock {
> >                                 #clock-cells = <0>;
> >                         };
> >
> >                         smp_twd_clk: smp-twd-clock {
> >                                 #clock-cells = <0>;
> >                         };
> >
> >                         clkout_clk: clkout-clock {
> >                                 /* Cell 1 id, cell 2 source, cell 3 div */
> >                                 #clock-cells = <3>;
> >                         };
> >                 };
> >
> > So it seems like the device core things that the
> > resets are never coming online. But they are! I put some
> > debug prints into the PRCC reset driver to make sure.
> >
> > diff --git a/drivers/clk/ux500/reset-prcc.c b/drivers/clk/ux500/reset-prcc.c
> > index f7e48941fbc7..b1ed9a48cdfb 100644
> > --- a/drivers/clk/ux500/reset-prcc.c
> > +++ b/drivers/clk/ux500/reset-prcc.c
> > @@ -178,4 +178,6 @@ void u8500_prcc_reset_init(struct device_node *np,
> > struct u8500_prcc_reset *ur)
> >         ret = reset_controller_register(rcdev);
> >         if (ret)
> >                 pr_err("PRCC failed to register reset controller\n");
> > +
> > +       pr_info("PRCC reset controller registered\n");
> >  }
> >
> > [    0.000000] PRCC reset controller registered
> >
> > It is registered as early as the clocks.
> >
> > So it seems something is fishy with the reset resource resolution?
> >
> > > > The last regulator (vmem_3v3) is something the eMMC that didn't
> > > > probe was supposed to use.
> > > >
> > > > All the failing drivers are AMBA PrimeCell devices:
> > > > drivers/mmc/host/mmci.c
> > > > drivers/tty/serial/amba-pl011.c
> > > > drivers/i2c/busses/i2c-nomadik.c
> > > >
> > > > This makes me suspect something was done for ordinary (platform)
> > > > devices that didn't happen for AMBA devices?
> > > >
> > > > This is the main portion of the device tree containing these
> > > > devices and their resources:
> > > > arch/arm/boot/dts/ste-dbx5x0.dtsi
> > >
> > > I'll take a closer look on Monday. Btw, I always prefer the dts file
> > > because there's always some property that adds a dependency and that
> > > might be the clue to whatever is broken. But I'll take a look at this.
> >
> > The top level DTS for this system I'm testing on (source for the
> > log) is
> > arch/arm/boot/dts/ste-ux500-samsung-golden.dts
> >
> > > It's unlikely the patch attached might fix it, but can you give it a shot?
> >
> > Sadly does not help!
>
> Did this mail go through with the attachment?

Attachment came through for me. I should be able to give you a test
patch this week. My guess on what the issue is: I have to walk up to
the parents and check for FWNODE_FLAG_DEV_INITIALIZED or set it
recursively for all the child nodes (termination condition being when
the fwnode has a device).

-Saravana

>
> I have pastebin:ed it here as well:
> https://pastebin.com/Fwc7E1Sm
>
> Yours,
> Linus Walleij

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ