Message-ID: <ZoydV7vad5JWIcZb@ghost>
Date: Mon, 8 Jul 2024 19:15:51 -0700
From: Charlie Jenkins <charlie@...osinc.com>
To: Anup Patel <apatel@...tanamicro.com>
Cc: Emil Renner Berthing <emil.renner.berthing@...onical.com>,
	Anup Patel <anup@...infault.org>,
	Palmer Dabbelt <palmer@...belt.com>,
	Paul Walmsley <paul.walmsley@...ive.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Rob Herring <robh+dt@...nel.org>,
	Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
	Frank Rowand <frowand.list@...il.com>,
	Conor Dooley <conor+dt@...nel.org>,
	Samuel Holland <samuel@...lland.org>, devicetree@...r.kernel.org,
	Saravana Kannan <saravanak@...gle.com>,
	Marc Zyngier <maz@...nel.org>, linux-kernel@...r.kernel.org,
	Björn Töpel <bjorn@...nel.org>,
	Atish Patra <atishp@...shpatra.org>,
	linux-riscv@...ts.infradead.org,
	linux-arm-kernel@...ts.infradead.org,
	Andrew Jones <ajones@...tanamicro.com>
Subject: Re: [PATCH v14 01/18] irqchip/sifive-plic: Convert PLIC driver into
 a platform driver

On Thu, Jun 20, 2024 at 08:38:09PM +0530, Anup Patel wrote:
> On Thu, Jun 20, 2024 at 6:40 PM Emil Renner Berthing
> <emil.renner.berthing@...onical.com> wrote:
> >
> > Anup Patel wrote:
> > > On Wed, Jun 19, 2024 at 11:16 PM Emil Renner Berthing
> > > <emil.renner.berthing@...onical.com> wrote:
> > > >
> > > > Anup Patel wrote:
> > > > > On Tue, Jun 18, 2024 at 7:00 PM Emil Renner Berthing
> > > > > <emil.renner.berthing@...onical.com> wrote:
> > > > > >
> > > > > > Anup Patel wrote:
> > > > > > > The PLIC driver does not require very early initialization so convert
> > > > > > > it into a platform driver.
> > > > > > >
> > > > > > > After conversion, the PLIC driver is probed after CPUs are brought-up
> > > > > > > so setup cpuhp state after context handler of all online CPUs are
> > > > > > > initialized otherwise PLIC driver crashes for platforms with multiple
> > > > > > > PLIC instances.
> > > > > > >
> > > > > > > Signed-off-by: Anup Patel <apatel@...tanamicro.com>
> > > > > >
> > > > > > Hi Anup,
> > > > > >
> > > > > > Sorry for the late reply to the mailing list, but ever since 6.9 where this was
> > > > > > applied my Allwinner D1 based boards no longer boot. This is the log of my
> > > > > > > LicheeRV Dock booting plain 6.10-rc4, locking up and then rebooting due to the
> > > > > > > watchdog timing out:
> > > > > >
> > > > > > https://pastebin.com/raw/nsbzgEKW
> > > > > >
> > > > > > On 6.10-rc4 I can bring the same board to boot by reverting this patch and all
> > > > > > patches building on it. Eg.:
> > > > > >
> > > > > >   git revert e306a894bd51 a7fb69ffd7ce abb720579490 \
> > > > > >              956521064780 a15587277a24 6c725f33d67b \
> > > > > >              b68d0ff529a9 25d862e183d4 8ec99b033147
> > > > >
> > > > > Does your board boot with only SBI timer driver enabled ?
> > > >
> > > > I'm not 100% sure this is what you mean, but with this change I can disable
> > > > CONFIG_SUN4I_TIMER:
> > > >
> > > > diff --git a/arch/riscv/Kconfig.socs b/arch/riscv/Kconfig.socs
> > > > index f51bb24bc84c..0143545348eb 100644
> > > > --- a/arch/riscv/Kconfig.socs
> > > > +++ b/arch/riscv/Kconfig.socs
> > > > @@ -39,7 +39,6 @@ config ARCH_SUNXI
> > > >         bool "Allwinner sun20i SoCs"
> > > >         depends on MMU && !XIP_KERNEL
> > > >         select ERRATA_THEAD
> > > > -       select SUN4I_TIMER
> > > >         help
> > > >           This enables support for Allwinner sun20i platform hardware,
> > > >           including boards based on the D1 and D1s SoCs.
> > > >
> > > >
> > > > But unfortunately the board still doesn't boot:
> > > > https://pastebin.com/raw/AwRxcfeu
> > >
> > > I think we should enable debug prints in DD core and see
> > > which device is not getting probed due to lack of a provider.
> > >
> > > Just add "#define DEBUG" at the top in drivers/base/core.c
> > > and boot again with "loglevel=8" kernel parameter (along with
> > > the above change).
> >
> > With the above changes this is what I get:
> > https://pastebin.com/raw/JfRrEahT
> 
> You should see prints like below which show producer consumer
> relation:
> 
> [    0.214589] /soc/rtc@...000 Linked as a fwnode consumer to /soc/plic@...0000
> [    0.214966] /soc/serial@...00000 Linked as a fwnode consumer to
> /soc/plic@...0000
> [    0.215443] /soc/virtio_mmio@...08000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [    0.216041] /soc/virtio_mmio@...07000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [    0.216482] /soc/virtio_mmio@...06000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [    0.216868] /soc/virtio_mmio@...05000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [    0.217477] /soc/virtio_mmio@...04000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [    0.217949] /soc/virtio_mmio@...03000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [    0.218595] /soc/virtio_mmio@...02000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [    0.219280] /soc/virtio_mmio@...01000 Linked as a fwnode consumer
> to /soc/plic@...0000
> [    0.219908] /soc/plic@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [    0.220800] /soc/plic@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [    0.221323] /soc/plic@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [    0.221838] /soc/plic@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [    0.222347] /soc/clint@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [    0.222769] /soc/clint@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [    0.223864] /soc/clint@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [    0.224370] /soc/clint@...0000 Linked as a fwnode consumer to
> /cpus/cpu@...nterrupt-controller
> [    0.225217] /soc/pci@...00000 Linked as a fwnode consumer to
> /soc/plic@...0000
> 
> To get further prints, I suggest enabling SBI_HVC console and use
> "console=hvc0" as kernel parameter.
> 
> Regards,
> Anup

I did some follow-up research into this. The hang after "cpuidle:
using governor menu" is caused by the kernel getting stuck inside
check_unaligned_access(). Specifically, a loop there waits for jiffies
to start ticking, but it never does:

while ((now = jiffies) == start_jiffies)
	cpu_relax();

`jiffies` is fixed at 0xfffedb08, effectively making this a while(true)
loop. This happens with and without SUN4I_TIMER.
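
For reference, 0xfffedb08 equals the kernel's INITIAL_JIFFIES value,
((unsigned long)(unsigned int)(-300*HZ)), when HZ is 250. HZ=250 is an
assumption about this config; the logs above do not state it. If it
holds, jiffies has not advanced even once since boot. The userspace
sketch below reproduces that arithmetic and shows a bounded variant of
the wait. It is not kernel code, and initial_jiffies() and
wait_for_tick() are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Mirrors the kernel's INITIAL_JIFFIES expression for a 64-bit build:
 * the 32-bit wrap of -300 * HZ, zero-extended to 64 bits.
 */
static uint64_t initial_jiffies(int hz)
{
	return (uint64_t)(uint32_t)(-300 * hz);
}

/*
 * Hypothetical helper: poll a tick counter until it changes or until
 * max_spins polls have been used up. Returns true if the counter moved.
 */
static bool wait_for_tick(const volatile uint64_t *ticks,
			  unsigned long max_spins)
{
	uint64_t start = *ticks;

	while (*ticks == start) {
		if (max_spins-- == 0)
			return false;	/* counter never moved */
	}
	return true;
}
```

With a counter that never changes, wait_for_tick(&ticks, 1000) returns
false instead of spinning forever, which is the difference from the
unbounded loop quoted above.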

This hang unfortunately happens before the "Linked as a fwnode consumer"
print statements start.

After bypassing this with the configs

CONFIG_NONPORTABLE=y
CONFIG_RISCV_EFFICIENT_UNALIGNED_ACCESS=y

a new warning is tripped:

[    1.015134] No max_rate, ignoring min_rate of clock 9 - pll-video0
[    1.021322] WARNING: CPU: 0 PID: 1 at drivers/clk/sunxi-ng/ccu_common.c:155 sunxi_ccu_probe+0x144/0x1a2
[    1.021351] Modules linked in:
[    1.021360] CPU: 0 PID: 1 Comm: swapper Tainted: G        W          6.10.0-rc6 #1
[    1.021372] Hardware name: Allwinner D1 Nezha (changed) (DT)
[    1.021377] epc : sunxi_ccu_probe+0x144/0x1a2
[    1.021386]  ra : sunxi_ccu_probe+0x144/0x1a2
[    1.021397] epc : ffffffff80405a50 ra : ffffffff80405a50 sp : ffffffc80000bb80
[    1.021406]  gp : ffffffff815f69c8 tp : ffffffd801df8000 t0 : 6100000000000000
[    1.021414]  t1 : 000000000000004e t2 : 61725f78616d206f s0 : ffffffc80000bbe0
[    1.021422]  s1 : ffffffff81537498 a0 : 0000000000000036 a1 : 000000000000054b
[    1.021430]  a2 : 00000000ffffefff a3 : 0000000000000000 a4 : ffffffff8141f628
[    1.021438]  a5 : 0000000000000000 a6 : 0000000000000000 a7 : 000000004442434e
[    1.021446]  s2 : 0000000000000009 s3 : 0000000000000000 s4 : ffffffd801dc9010
[    1.021453]  s5 : ffffffd802428a00 s6 : ffffffd83ffdcf20 s7 : ffffffc800015000
[    1.021462]  s8 : ffffffff80e55360 s9 : ffffffff81034598 s10: 0000000000000000
[    1.021470]  s11: 0000000000000000 t3 : ffffffff8160a257 t4 : ffffffff8160a257
[    1.021478]  t5 : ffffffff8160a258 t6 : ffffffc80000b990
[    1.021485] status: 0000000200000120 badaddr: 0000000000000000 cause: 0000000000000003
[    1.021493] [<ffffffff80405a50>] sunxi_ccu_probe+0x144/0x1a2
[    1.021510] [<ffffffff80405af6>] devm_sunxi_ccu_probe+0x48/0x82
[    1.021524] [<ffffffff80409020>] sun20i_d1_ccu_probe+0xba/0xfa
[    1.021546] [<ffffffff804a8b40>] platform_probe+0x4e/0xa6
[    1.021562] [<ffffffff808d81ee>] really_probe+0x10a/0x2dc
[    1.021581] [<ffffffff808d8472>] __driver_probe_device.part.0+0xb2/0xe8
[    1.021597] [<ffffffff804a67aa>] driver_probe_device+0x7a/0xca
[    1.021621] [<ffffffff804a6912>] __driver_attach+0x52/0x164
[    1.021638] [<ffffffff804a4c7a>] bus_for_each_dev+0x56/0x8c
[    1.021656] [<ffffffff804a6382>] driver_attach+0x1a/0x22
[    1.021673] [<ffffffff804a5c18>] bus_add_driver+0xea/0x1d8
[    1.021690] [<ffffffff804a7852>] driver_register+0x3e/0xd8
[    1.021709] [<ffffffff804a8826>] __platform_driver_register+0x1c/0x24
[    1.021725] [<ffffffff80a17488>] sun20i_d1_ccu_driver_init+0x1a/0x22
[    1.021746] [<ffffffff800026ae>] do_one_initcall+0x46/0x1be
[    1.021762] [<ffffffff80a00ef2>] kernel_init_freeable+0x1c6/0x220
[    1.021791] [<ffffffff808e0b46>] kernel_init+0x1e/0x112
[    1.021807] [<ffffffff808e7632>] ret_from_fork+0xe/0x1c

The warning is not fatal, so execution continues until it hangs at:

[    2.110919] printk: legacy console [ttyS0] disabled
[    2.136911] 2500000.serial: ttyS0 at MMIO 0x2500000 (irq = 205, base_baud = 1500000) is a 16550A
[    2.145674] printk: legacy console [ttyS0] enabled
[    2.145674] printk: legacy console [ttyS0] enabled
[    2.155095] printk: legacy bootconsole [sbi0] disabled
[    2.155095] printk: legacy bootconsole [sbi0] disabled

I have not been able to discover why it hangs here.

The clock is somehow relying on the previous behavior of this PLIC
driver.

- Charlie
