lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACRpkdaw9uJHgMVMdXcwu83fV83uo8r5vtYC8UmO6w=MaWaFdw@mail.gmail.com>
Date:   Sun, 26 Feb 2023 22:55:46 +0100
From:   Linus Walleij <linus.walleij@...aro.org>
To:     Saravana Kannan <saravanak@...gle.com>
Cc:     Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Ulf Hansson <ulf.hansson@...aro.org>,
        Android Kernel Team <kernel-team@...roid.com>
Subject: Re: Regression in probing some AMBA devices possibly devlink related

On Sun, Feb 26, 2023 at 12:56 AM Saravana Kannan <saravanak@...gle.com> wrote:
> On Sat, Feb 25, 2023 at 6:29 AM Linus Walleij <linus.walleij@...aro.org> wrote:
> >
> > Hi Saravana,
> >
> > I have a boot regression for Ux500 on mainline, but bisecting mainline
> > isn't quite working for misc reasons :/
> >
> > I'm not sure about this regression, but it smells like devlink-related.
> >
> > Ux500 have a bunch of normal and some AMBA devices. After
> > boot this happens and we hang waiting for rootfs (which is on eMMC):
>
> Hmmm... my recent changes were definitely tested on systems with AMBA
> devices and it worked.

Some of them work, such as the TWD, interrupt controller GIC
and the CoreSight stuff.

> > [   31.849586] amba 80126000.mmc: deferred probe pending
> > [   31.854801] amba 80118000.mmc: deferred probe pending
> > [   31.859895] amba 80005000.mmc: deferred probe pending
> > [   31.870297] amba 80120000.uart: deferred probe pending
> > [   31.875472] amba 80121000.uart: deferred probe pending
> > [   31.880689] amba 80004000.i2c: deferred probe pending
> > [   31.885799] amba 80128000.i2c: deferred probe pending
> > [   31.890932] amba 80110000.i2c: deferred probe pending
> > [   51.688361] vmem_3v3: disabling
>
> What does /debug/devices_deferred say about these? That should tell us
> exactly what's blocking it.

Can't get to check that sadly, because the root fs is on eMMC and
that one is deferring indefinitely.

I tried to compile in an initramfs but that became too big for this
target. I'll check if I can use that on another target which I
can boot directly from RAM.

> Also if you convert all the pr_debug and dev_dbg in
> drivers/base/core.c that should give us enough of an idea of what's
> happening. Can you do that too and send it as an attachment (I logs
> are hard to read when they get word wrapped)?

This I could do!

See attachment.

It seems these devices are waiting for the PRCC reset-controller
which is in
drivers/clk/ux500/reset-prcc.c

Very well! I went into
arch/arm/boot/dts/ste-dbx5x0.dtsi
and commented out all "resets = < &prcc....>; statements,
and indeed then the system comes up fine! (The resets are
not vital to boot the system.)

But this is weird because the driver for the resets is spawn from
the same device node that is spawning the clocks...
All resources (clocks and resets) are spawn from here:
drivers/clk/ux500/u8500_of_clk.c

>From these nodes in the DTS
arch/arm/boot/dts/ste-dbx5x0.dtsi:

                clocks {
                        compatible = "stericsson,u8500-clks";
                        /*
                         * Registers for the CLKRST block on peripheral
                         * groups 1, 2, 3, 5, 6,
                         */
                        reg = <0x8012f000 0x1000>, <0x8011f000 0x1000>,
                            <0x8000f000 0x1000>, <0xa03ff000 0x1000>,
                            <0xa03cf000 0x1000>;

                        prcmu_clk: prcmu-clock {
                                #clock-cells = <1>;
                        };

                        prcc_pclk: prcc-periph-clock {
                                #clock-cells = <2>;
                        };

                        prcc_kclk: prcc-kernel-clock {
                                #clock-cells = <2>;
                        };

                        prcc_reset: prcc-reset-controller {
                                #reset-cells = <2>;
                        };

                        rtc_clk: rtc32k-clock {
                                #clock-cells = <0>;
                        };

                        smp_twd_clk: smp-twd-clock {
                                #clock-cells = <0>;
                        };

                        clkout_clk: clkout-clock {
                                /* Cell 1 id, cell 2 source, cell 3 div */
                                #clock-cells = <3>;
                        };
                };

So it seems like the device core things that the
resets are never coming online. But they are! I put some
debug prints into the PRCC reset driver to make sure.

diff --git a/drivers/clk/ux500/reset-prcc.c b/drivers/clk/ux500/reset-prcc.c
index f7e48941fbc7..b1ed9a48cdfb 100644
--- a/drivers/clk/ux500/reset-prcc.c
+++ b/drivers/clk/ux500/reset-prcc.c
@@ -178,4 +178,6 @@ void u8500_prcc_reset_init(struct device_node *np,
struct u8500_prcc_reset *ur)
        ret = reset_controller_register(rcdev);
        if (ret)
                pr_err("PRCC failed to register reset controller\n");
+
+       pr_info("PRCC reset controller registered\n");
 }

[    0.000000] PRCC reset controller registered

It is registered as early as the clocks.

So it seems something is fishy with the reset resource resolution?

> > The last regulator (vmem_3v3) is something the eMMC that didn't
> > probe was supposed to use.
> >
> > All the failing drivers are AMBA PrimeCell devices:
> > drivers/mmc/host/mmci.c
> > drivers/tty/serial/amba-pl011.c
> > drivers/i2c/busses/i2c-nomadik.c
> >
> > This makes me suspect something was done for ordinary (platform)
> > devices that didn't happen for AMBA devices?
> >
> > This is the main portion of the device tree containing these
> > devices and their resources:
> > arch/arm/boot/dts/ste-dbx5x0.dtsi
>
> I'll take a closer look on Monday. Btw, I always prefer the dts file
> because there's always some property that adds a dependency and that
> might be the clue to whatever is broken. But I'll take a look at this.

The top level DTS for this system I'm testing on (source for the
log) is
arch/arm/boot/dts/ste-ux500-samsung-golden.dts

> It's unlikely the patch attached might fix it, but can you give it a shot?

Sadly does not help!

Yours,
Linus Walleij

View attachment "bar.txt" of type "text/plain" (143710 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ