Message-ID: <ZoydV7vad5JWIcZb@ghost>
Date: Mon, 8 Jul 2024 19:15:51 -0700
From: Charlie Jenkins <charlie@...osinc.com>
To: Anup Patel <apatel@...tanamicro.com>
Cc: Emil Renner Berthing <emil.renner.berthing@...onical.com>,
Anup Patel <anup@...infault.org>,
Palmer Dabbelt <palmer@...belt.com>,
Paul Walmsley <paul.walmsley@...ive.com>,
Thomas Gleixner <tglx@...utronix.de>,
Rob Herring <robh+dt@...nel.org>,
Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
Frank Rowand <frowand.list@...il.com>,
Conor Dooley <conor+dt@...nel.org>,
Samuel Holland <samuel@...lland.org>, devicetree@...r.kernel.org,
Saravana Kannan <saravanak@...gle.com>,
Marc Zyngier <maz@...nel.org>, linux-kernel@...r.kernel.org,
Björn Töpel <bjorn@...nel.org>,
Atish Patra <atishp@...shpatra.org>,
linux-riscv@...ts.infradead.org,
linux-arm-kernel@...ts.infradead.org,
Andrew Jones <ajones@...tanamicro.com>
Subject: Re: [PATCH v14 01/18] irqchip/sifive-plic: Convert PLIC driver into
a platform driver
On Thu, Jun 20, 2024 at 08:38:09PM +0530, Anup Patel wrote:
> On Thu, Jun 20, 2024 at 6:40 PM Emil Renner Berthing
> <emil.renner.berthing@...onical.com> wrote:
> >
> > Anup Patel wrote:
> > > On Wed, Jun 19, 2024 at 11:16 PM Emil Renner Berthing
> > > <emil.renner.berthing@...onical.com> wrote:
> > > >
> > > > Anup Patel wrote:
> > > > > On Tue, Jun 18, 2024 at 7:00 PM Emil Renner Berthing
> > > > > <emil.renner.berthing@...onical.com> wrote:
> > > > > >
> > > > > > Anup Patel wrote:
> > > > > > > The PLIC driver does not require very early initialization so convert
> > > > > > > it into a platform driver.
> > > > > > >
> > > > > > > After conversion, the PLIC driver is probed after CPUs are brought-up
> > > > > > > so setup cpuhp state after context handler of all online CPUs are
> > > > > > > initialized otherwise PLIC driver crashes for platforms with multiple
> > > > > > > PLIC instances.
> > > > > > >
> > > > > > > Signed-off-by: Anup Patel <apatel@...tanamicro.com>
> > > > > >
> > > > > > Hi Anup,
> > > > > >
> > > > > > Sorry for the late reply to the mailing list, but ever since 6.9 where this was
> > > > > > applied my Allwinner D1 based boards no longer boot. This is the log of my
> > > > > > LicheeRV Dock booting plain 6.10-rc4, locking up and then rebooting due to the
> > > > > > > watchdog timing out:
> > > > > >
> > > > > > https://pastebin.com/raw/nsbzgEKW
> > > > > >
> > > > > > On 6.10-rc4 I can bring the same board to boot by reverting this patch and all
> > > > > > patches building on it. Eg.:
> > > > > >
> > > > > > git revert e306a894bd51 a7fb69ffd7ce abb720579490 \
> > > > > > 956521064780 a15587277a24 6c725f33d67b \
> > > > > > b68d0ff529a9 25d862e183d4 8ec99b033147
> > > > >
> > > > > Does your board boot with only SBI timer driver enabled ?
> > > >
> > > > I'm not 100% sure this is what you mean, but with this change I can disable
> > > > CONFIG_SUN4I_TIMER:
> > > >
> > > > diff --git a/arch/riscv/Kconfig.socs b/arch/riscv/Kconfig.socs
> > > > index f51bb24bc84c..0143545348eb 100644
> > > > --- a/arch/riscv/Kconfig.socs
> > > > +++ b/arch/riscv/Kconfig.socs
> > > > @@ -39,7 +39,6 @@ config ARCH_SUNXI
> > > > bool "Allwinner sun20i SoCs"
> > > > depends on MMU && !XIP_KERNEL
> > > > select ERRATA_THEAD
> > > > - select SUN4I_TIMER
> > > > help
> > > > This enables support for Allwinner sun20i platform hardware,
> > > > including boards based on the D1 and D1s SoCs.
> > > >
> > > >
> > > > But unfortunately the board still doesn't boot:
> > > > https://pastebin.com/raw/AwRxcfeu
> > >
> > > I think we should enable debug prints in DD core and see
> > > which device is not getting probed due to lack of a provider.
> > >
> > > Just add "#define DEBUG" at the top in drivers/base/core.c
> > > and boot again with "loglevel=8" kernel parameter (along with
> > > the above change).
> >
> > With the above changes this is what I get:
> > https://pastebin.com/raw/JfRrEahT
>
> You should see prints like below which show producer consumer
> relation:
>
> [ 0.214589] /soc/rtc@...000 Linked as a fwnode consumer to /soc/plic@...0000
> [ 0.214966] /soc/serial@...00000 Linked as a fwnode consumer to
> /soc/plic@...0000
> [ 0.215443] /soc/virtio_mmio@...08000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [ 0.216041] /soc/virtio_mmio@...07000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [ 0.216482] /soc/virtio_mmio@...06000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [ 0.216868] /soc/virtio_mmio@...05000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [ 0.217477] /soc/virtio_mmio@...04000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [ 0.217949] /soc/virtio_mmio@...03000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [ 0.218595] /soc/virtio_mmio@...02000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [ 0.219280] /soc/virtio_mmio@...01000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [ 0.219908] /soc/plic@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [ 0.220800] /soc/plic@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [ 0.221323] /soc/plic@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [ 0.221838] /soc/plic@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [ 0.222347] /soc/clint@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [ 0.222769] /soc/clint@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [ 0.223864] /soc/clint@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [ 0.224370] /soc/clint@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [ 0.225217] /soc/pci@...00000 Linked as a fwnode consumer to
> /soc/plic@...0000
>
> To get further prints, I suggest enabling SBI_HVC console and use
> "console=hvc0" as kernel parameter.
>
> Regards,
> Anup

I did some follow-up research into this. The hang after "cpuidle:
using governor menu" is caused by the boot getting stuck inside
check_unaligned_access(). Specifically, there is a loop that waits for
jiffies to start ticking, but it never does:

	while ((now = jiffies) == start_jiffies)
		cpu_relax();

`jiffies` is fixed at 0xfffedb08, effectively making this a while(true)
loop. This happens with and without SUN4I_TIMER.
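
For illustration only, here is a minimal user-space sketch of why a loop of that shape never terminates when no timer tick arrives, and how a bounded poll would turn the hang into a detectable failure. Everything in it is hypothetical (struct tick_sim, read_jiffies(), wait_for_tick() and the poll budget are made up for this example); it is not the kernel's code in check_unaligned_access() and not a proposed fix.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the jiffies counter and its tick source. */
struct tick_sim {
	uint64_t jiffies;        /* simulated counter value */
	bool timer_irq_working;  /* false models "ticks never arrive" */
	uint64_t polls;          /* number of reads so far */
};

/* Read the simulated counter; it advances once every 1000 reads,
 * but only if the simulated timer interrupt is working. */
static uint64_t read_jiffies(struct tick_sim *sim)
{
	sim->polls++;
	if (sim->timer_irq_working && sim->polls % 1000 == 0)
		sim->jiffies++;
	return sim->jiffies;
}

/*
 * Same shape as the loop quoted above, but with a poll budget.
 * Returns true if the counter changed, false if the budget ran out.
 */
static bool wait_for_tick(struct tick_sim *sim, uint64_t max_polls)
{
	uint64_t start = read_jiffies(sim);
	uint64_t n;

	for (n = 0; n < max_polls; n++) {
		if (read_jiffies(sim) != start)
			return true;
	}
	return false;
}
```

With timer_irq_working set to false the counter stays at its initial value (for example 0xfffedb08), so wait_for_tick() exhausts its budget and returns false, where the unbounded loop would spin forever.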
This hang unfortunately happens before the "Linked as a fwnode consumer"
print statements start.

After bypassing this check with the configs

	CONFIG_NONPORTABLE=y
	CONFIG_RISCV_EFFICIENT_UNALIGNED_ACCESS=y

a new warning is tripped:

[ 1.015134] No max_rate, ignoring min_rate of clock 9 - pll-video0
[ 1.021322] WARNING: CPU: 0 PID: 1 at drivers/clk/sunxi-ng/ccu_common.c:155 sunxi_ccu_probe+0x144/0x1a2
[ 1.021351] Modules linked in:
[ 1.021360] CPU: 0 PID: 1 Comm: swapper Tainted: G W 6.10.0-rc6 #1
[ 1.021372] Hardware name: Allwinner D1 Nezha (changed) (DT)
[ 1.021377] epc : sunxi_ccu_probe+0x144/0x1a2
[ 1.021386] ra : sunxi_ccu_probe+0x144/0x1a2
[ 1.021397] epc : ffffffff80405a50 ra : ffffffff80405a50 sp : ffffffc80000bb80
[ 1.021406] gp : ffffffff815f69c8 tp : ffffffd801df8000 t0 : 6100000000000000
[ 1.021414] t1 : 000000000000004e t2 : 61725f78616d206f s0 : ffffffc80000bbe0
[ 1.021422] s1 : ffffffff81537498 a0 : 0000000000000036 a1 : 000000000000054b
[ 1.021430] a2 : 00000000ffffefff a3 : 0000000000000000 a4 : ffffffff8141f628
[ 1.021438] a5 : 0000000000000000 a6 : 0000000000000000 a7 : 000000004442434e
[ 1.021446] s2 : 0000000000000009 s3 : 0000000000000000 s4 : ffffffd801dc9010
[ 1.021453] s5 : ffffffd802428a00 s6 : ffffffd83ffdcf20 s7 : ffffffc800015000
[ 1.021462] s8 : ffffffff80e55360 s9 : ffffffff81034598 s10: 0000000000000000
[ 1.021470] s11: 0000000000000000 t3 : ffffffff8160a257 t4 : ffffffff8160a257
[ 1.021478] t5 : ffffffff8160a258 t6 : ffffffc80000b990
[ 1.021485] status: 0000000200000120 badaddr: 0000000000000000 cause: 0000000000000003
[ 1.021493] [<ffffffff80405a50>] sunxi_ccu_probe+0x144/0x1a2
[ 1.021510] [<ffffffff80405af6>] devm_sunxi_ccu_probe+0x48/0x82
[ 1.021524] [<ffffffff80409020>] sun20i_d1_ccu_probe+0xba/0xfa
[ 1.021546] [<ffffffff804a8b40>] platform_probe+0x4e/0xa6
[ 1.021562] [<ffffffff808d81ee>] really_probe+0x10a/0x2dc
[ 1.021581] [<ffffffff808d8472>] __driver_probe_device.part.0+0xb2/0xe8
[ 1.021597] [<ffffffff804a67aa>] driver_probe_device+0x7a/0xca
[ 1.021621] [<ffffffff804a6912>] __driver_attach+0x52/0x164
[ 1.021638] [<ffffffff804a4c7a>] bus_for_each_dev+0x56/0x8c
[ 1.021656] [<ffffffff804a6382>] driver_attach+0x1a/0x22
[ 1.021673] [<ffffffff804a5c18>] bus_add_driver+0xea/0x1d8
[ 1.021690] [<ffffffff804a7852>] driver_register+0x3e/0xd8
[ 1.021709] [<ffffffff804a8826>] __platform_driver_register+0x1c/0x24
[ 1.021725] [<ffffffff80a17488>] sun20i_d1_ccu_driver_init+0x1a/0x22
[ 1.021746] [<ffffffff800026ae>] do_one_initcall+0x46/0x1be
[ 1.021762] [<ffffffff80a00ef2>] kernel_init_freeable+0x1c6/0x220
[ 1.021791] [<ffffffff808e0b46>] kernel_init+0x1e/0x112
[ 1.021807] [<ffffffff808e7632>] ret_from_fork+0xe/0x1c

The warning is not fatal, so execution continues until it hangs at:

[ 2.110919] printk: legacy console [ttyS0] disabled
[ 2.136911] 2500000.serial: ttyS0 at MMIO 0x2500000 (irq = 205, base_baud = 1500000) is a 16550A
[ 2.145674] printk: legacy console [ttyS0] enabled
[ 2.145674] printk: legacy console [ttyS0] enabled
[ 2.155095] printk: legacy bootconsole [sbi0] disabled
[ 2.155095] printk: legacy bootconsole [sbi0] disabled
I have not been able to discover why it hangs here.
The clock is somehow relying on the previous behavior of this PLIC
driver.
- Charlie