[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGETcx8cb4JSNMqRXfq2zAMjky_033wu1Ygrb1YqAL=a8voyRA@mail.gmail.com>
Date: Mon, 27 Feb 2023 14:00:51 -0800
From: Saravana Kannan <saravanak@...gle.com>
To: Linus Walleij <linus.walleij@...aro.org>
Cc: Greg KH <gregkh@...uxfoundation.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Ulf Hansson <ulf.hansson@...aro.org>,
Android Kernel Team <kernel-team@...roid.com>
Subject: Re: Regression in probing some AMBA devices possibly devlink related
On Mon, Feb 27, 2023 at 1:07 PM Linus Walleij <linus.walleij@...aro.org> wrote:
>
> On Sun, Feb 26, 2023 at 10:55 PM Linus Walleij <linus.walleij@...aro.org> wrote:
> > On Sun, Feb 26, 2023 at 12:56 AM Saravana Kannan <saravanak@...gle.com> wrote:
> > > On Sat, Feb 25, 2023 at 6:29 AM Linus Walleij <linus.walleij@...aro.org> wrote:
> > > >
> > > > Hi Saravana,
> > > >
> > > > I have a boot regression for Ux500 on mainline, but bisecting mainline
> > > > isn't quite working for misc reasons :/
> > > >
> > > > I'm not sure about this regression, but it smells like devlink-related.
> > > >
> > > > Ux500 have a bunch of normal and some AMBA devices. After
> > > > boot this happens and we hang waiting for rootfs (which is on eMMC):
> > >
> > > Hmmm... my recent changes were definitely tested on systems with AMBA
> > > devices and it worked.
> >
> > Some of them work, such as the TWD, interrupt controller GIC
> > and the CoreSight stuff.
> >
> > > > [ 31.849586] amba 80126000.mmc: deferred probe pending
> > > > [ 31.854801] amba 80118000.mmc: deferred probe pending
> > > > [ 31.859895] amba 80005000.mmc: deferred probe pending
> > > > [ 31.870297] amba 80120000.uart: deferred probe pending
> > > > [ 31.875472] amba 80121000.uart: deferred probe pending
> > > > [ 31.880689] amba 80004000.i2c: deferred probe pending
> > > > [ 31.885799] amba 80128000.i2c: deferred probe pending
> > > > [ 31.890932] amba 80110000.i2c: deferred probe pending
> > > > [ 51.688361] vmem_3v3: disabling
> > >
> > > What does /debug/devices_deferred say about these? That should tell us
> > > exactly what's blocking it.
> >
> > Can't get to check that sadly, because the root fs is on eMMC and
> > that one is deferring indefinitely.
> >
> > I tried to compile in an initramfs but that became too big for this
> > target. I'll check if I can use that on another target which I
> > can boot directly from RAM.
> >
> > > Also if you convert all the pr_debug and dev_dbg in
> > > drivers/base/core.c that should give us enough of an idea of what's
> > > happening. Can you do that too and send it as an attachment (I logs
> > > are hard to read when they get word wrapped)?
> >
> > This I could do!
> >
> > See attachment.
> >
> > It seems these devices are waiting for the PRCC reset-controller
> > which is in
> > drivers/clk/ux500/reset-prcc.c
> >
> > Very well! I went into
> > arch/arm/boot/dts/ste-dbx5x0.dtsi
> > and commented out all "resets = < &prcc....>; statements,
> > and indeed then the system comes up fine! (The resets are
> > not vital to boot the system.)
> >
> > But this is weird because the driver for the resets is spawn from
> > the same device node that is spawning the clocks...
> > All resources (clocks and resets) are spawn from here:
> > drivers/clk/ux500/u8500_of_clk.c
> >
> > From these nodes in the DTS
> > arch/arm/boot/dts/ste-dbx5x0.dtsi:
> >
> > clocks {
> > compatible = "stericsson,u8500-clks";
> > /*
> > * Registers for the CLKRST block on peripheral
> > * groups 1, 2, 3, 5, 6,
> > */
> > reg = <0x8012f000 0x1000>, <0x8011f000 0x1000>,
> > <0x8000f000 0x1000>, <0xa03ff000 0x1000>,
> > <0xa03cf000 0x1000>;
> >
> > prcmu_clk: prcmu-clock {
> > #clock-cells = <1>;
> > };
> >
> > prcc_pclk: prcc-periph-clock {
> > #clock-cells = <2>;
> > };
> >
> > prcc_kclk: prcc-kernel-clock {
> > #clock-cells = <2>;
> > };
> >
> > prcc_reset: prcc-reset-controller {
> > #reset-cells = <2>;
> > };
> >
> > rtc_clk: rtc32k-clock {
> > #clock-cells = <0>;
> > };
> >
> > smp_twd_clk: smp-twd-clock {
> > #clock-cells = <0>;
> > };
> >
> > clkout_clk: clkout-clock {
> > /* Cell 1 id, cell 2 source, cell 3 div */
> > #clock-cells = <3>;
> > };
> > };
> >
> > So it seems like the device core things that the
> > resets are never coming online. But they are! I put some
> > debug prints into the PRCC reset driver to make sure.
> >
> > diff --git a/drivers/clk/ux500/reset-prcc.c b/drivers/clk/ux500/reset-prcc.c
> > index f7e48941fbc7..b1ed9a48cdfb 100644
> > --- a/drivers/clk/ux500/reset-prcc.c
> > +++ b/drivers/clk/ux500/reset-prcc.c
> > @@ -178,4 +178,6 @@ void u8500_prcc_reset_init(struct device_node *np,
> > struct u8500_prcc_reset *ur)
> > ret = reset_controller_register(rcdev);
> > if (ret)
> > pr_err("PRCC failed to register reset controller\n");
> > +
> > + pr_info("PRCC reset controller registered\n");
> > }
> >
> > [ 0.000000] PRCC reset controller registered
> >
> > It is registered as early as the clocks.
> >
> > So it seems something is fishy with the reset resource resolution?
> >
> > > > The last regulator (vmem_3v3) is something the eMMC that didn't
> > > > probe was supposed to use.
> > > >
> > > > All the failing drivers are AMBA PrimeCell devices:
> > > > drivers/mmc/host/mmci.c
> > > > drivers/tty/serial/amba-pl011.c
> > > > drivers/i2c/busses/i2c-nomadik.c
> > > >
> > > > This makes me suspect something was done for ordinary (platform)
> > > > devices that didn't happen for AMBA devices?
> > > >
> > > > This is the main portion of the device tree containing these
> > > > devices and their resources:
> > > > arch/arm/boot/dts/ste-dbx5x0.dtsi
> > >
> > > I'll take a closer look on Monday. Btw, I always prefer the dts file
> > > because there's always some property that adds a dependency and that
> > > might be the clue to whatever is broken. But I'll take a look at this.
> >
> > The top level DTS for this system I'm testing on (source for the
> > log) is
> > arch/arm/boot/dts/ste-ux500-samsung-golden.dts
> >
> > > It's unlikely the patch attached might fix it, but can you give it a shot?
> >
> > Sadly does not help!
>
> Did this mail go through with the attachment?
Attachment came through for me. I should be able to give you a test
patch this week. My guess on what the issue is: I have to walk up to
the parents and check for FWNODE_FLAG_DEV_INITIALIZED or set it
recursively for all the child nodes (termination condition being when
the fwnode has a device).
-Saravana
>
> I have pastebin:ed it here as well:
> https://pastebin.com/Fwc7E1Sm
>
> Yours,
> Linus Walleij
Powered by blists - more mailing lists