Message-ID: <cf751cf7-53a5-438b-9903-903bd8c39b23@ti.com>
Date: Wed, 5 Nov 2025 18:35:21 +0530
From: Siddharth Vadapalli <s-vadapalli@...com>
To: João Paulo Gonçalves
	<jpaulo.silvagoncalves@...il.com>, Nishanth Menon <nm@...com>, "Vignesh
 Raghavendra" <vigneshr@...com>, Kishon Vijay Abraham I <kishon@...com>,
	Swapnil Jakhade <sjakhade@...ence.com>
CC: Andrew Davis <afd@...com>, Francesco Dolcini <francesco@...cini.it>,
	João Paulo Gonçalves <joao.goncalves@...adex.com>,
	<linux-arm-kernel@...ts.infradead.org>, <linux-kernel@...r.kernel.org>,
	<s-vadapalli@...com>
Subject: Re: TI K3 AM69 Kernel Panic when PCIe Controller is Enabled

On 05/11/25 6:16 PM, João Paulo Gonçalves wrote:
> Hello,
> 
> I was testing upstream PCIe support on the TI AM69 with a Samsung NVMe SSD,
> in order to add upstream support for the Toradex Aquila AM69 (sent upstream
> [1]), and noticed that enabling the PCIe controller
> (CONFIG_PCI_J721E_HOST=y) on the arm64 defconfig causes a kernel panic in
> nvme_pci_enable:
> 
> [    7.397838] pci 0000:00:00.0: PCI bridge to [bus 00]
> [    7.402799] pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
> [    7.408887] pci 0000:00:00.0:   bridge window [mem 0x00000000-0x000fffff]
> [    7.415675] pci 0000:00:00.0:   bridge window [mem 0x00000000-0x000fffff 64bit pref]
> [    7.423501] pci 0000:00:00.0: supports D1
> [    7.427509] pci 0000:00:00.0: PME# supported from D0 D1 D3hot
> [    7.435887] pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> [    7.444083] pci 0000:01:00.0: [144d:a808] type 00 class 0x010802 PCIe Endpoint
> [    7.451449] pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x00003fff 64bit]
> [    7.458407] pci 0000:01:00.0: 15.752 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x2 link at 0000:00:00.0 (capable of 31.504 Gb/s with 8.0 GT/s PCIe x4 link)
> [    7.480637] pci 0000:01:00.0: ASPM: DT platform, enabling L0s-up L0s-dw L1 ASPM-L1.1 ASPM-L1.2 PCI-PM-L1.1 PCI-PM-L1.2
> 
> [    7.493685] pci 0000:01:00.0: ASPM: DT platform, enabling ClockPM
> [    7.500002] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
> [    7.506679] pci 0000:00:00.0: bridge window [mem 0x4000200000-0x40002fffff]: assigned
> [    7.514553] pci 0000:00:00.0: bridge window [mem 0x4000300000-0x40003fffff 64bit pref]: assigned
> [    7.514562] pci 0000:00:00.0: bridge window [io  0x1000-0x1fff]: assigned
> [    7.530134] pci 0000:01:00.0: BAR 0 [mem 0x4000200000-0x4000203fff 64bit]: assigned
>           Starting Connection service...
> [    7.537834] pci 0000:00:00.0: PCI bridge to [bus 01]
> [    7.547531] pci 0000:00:00.0:   bridge window [io  0x1000-0x1fff]
> [    7.553644] pci 0000:00:00.0:   bridge window [mem 0x4000200000-0x40002fffff]
> [    7.553684] audit: type=1334 audit(1748569869.624:11): prog-id=15 op=LOAD
> [    7.560814] pci 0000:00:00.0:   bridge window [mem 0x4000300000-0x40003fffff 64bit pref]
> [    7.575695] pci_bus 0000:00: resource 4 [io  0x0000-0xfffff]
> [    7.575702] pci_bus 0000:00: resource 5 [mem 0x4000101000-0x40ffffffff]
> [    7.587971] pci_bus 0000:01: resource 0 [io  0x1000-0x1fff]
> [    7.587976] pci_bus 0000:01: resource 1 [mem 0x4000200000-0x40002fffff]
> [    7.587979] pci_bus 0000:01: resource 2 [mem 0x4000300000-0x40003fffff 64bit pref]
> 
> [    7.612848] pcieport 0000:00:00.0: of_irq_parse_pci: failed with rc=-22
> [    7.619482] pcieport 0000:00:00.0: enabling device (0000 -> 0003)
> [    7.635343] pcieport 0000:00:00.0: PME: Signaling with IRQ 600
>           Starting Virtual Console Setup...
> [    7.649648] pcieport 0000:00:00.0: AER: enabled with IRQ 600
> [    7.660029] am65-cpsw-nuss c000000.ethernet eth1: PHY [gpio-0:04] driver [TI DP83867] (irq=379)
> [    7.660666] nvme 0000:01:00.0: of_irq_parse_pci: failed with rc=-22
> [    7.668777] j721e-pcie 2910000.pcie: host bridge /bus@...000/pcie@...0000 ranges:
> [    7.669791] am65-cpsw-nuss c000000.ethernet eth1: configuring for phy/sgmii link mode
> [    7.675751] nvme nvme0: pci function 0000:01:00.0
> [    7.682528] j721e-pcie 2910000.pcie:       IO 0x4100001000..0x4100100fff -> 0x0000001000
> [    7.690339] nvme 0000:01:00.0: enabling device (0000 -> 0002)
> [    7.695034] j721e-pcie 2910000.pcie:      MEM 0x4100101000..0x41ffffffff -> 0x0000101000
> [    7.703120] SError Interrupt on CPU7, code 0x00000000bf000000 -- SError
> [    7.703127] CPU: 7 UID: 0 PID: 70 Comm: kworker/u32:3 Tainted: G   M                6.18.0-rc1-00023-g417c89ae1522 #3 PREEMPT
> [    7.703133] Tainted: [M]=MACHINE_CHECK
> [    7.703135] Hardware name: Toradex Aquila AM69 V1.0 on Aquila Development Board (DT)
> [    7.703139] Workqueue: async async_run_entry_fn
> [    7.703155] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    7.703159] pc : nvme_pci_enable+0x54/0x598 [nvme]
> [    7.703178] lr : nvme_pci_enable+0x48/0x598 [nvme]
> [    7.703183] sp : ffff80008130bad0
> [    7.703184] x29: ffff80008130baf0 x28: 0000000000000000 x27: 0000000000000000
> [    7.703189] x26: ffff000800028028 x25: ffff000800e211c0 x24: ffff00080d242260
> [    7.703194] x23: ffffbfcbbb091150 x22: ffff0008081b50c8 x21: ffff0008081b5000
> [    7.703199] x20: ffff0008081b5000 x19: ffff00080d242000 x18: 0000000000000006
> [    7.703203] x17: 203e2d2066666630 x16: ffffbfcbeb0d1ca0 x15: 2e2e303030313030
> [    7.703207] x14: 303031347830204f x13: 3030303130303030 x12: 30307830203e2d20
> [    7.703211] x11: 6666663030313030 x10: 313478302e2e3030 x9 : 4920202020202020
> [    7.703216] x8 : 3a656963702e3030 x7 : 205d383235323836 x6 : 362e37202020205b
> [    7.703220] x5 : ffff00080834f000 x4 : 0000000000000006 x3 : ffff800092800000
> [    7.703224] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
> [    7.703229] Kernel panic - not syncing: Asynchronous SError Interrupt
> [    7.703232] CPU: 7 UID: 0 PID: 70 Comm: kworker/u32:3 Tainted: G   M                6.18.0-rc1-00023-g417c89ae1522 #3 PREEMPT
> [    7.703236] Tainted: [M]=MACHINE_CHECK
> [    7.703238] Hardware name: Toradex Aquila AM69 V1.0 on Aquila Development Board (DT)
> [    7.703239] Workqueue: async async_run_entry_fn
> [    7.703244] Call trace:
> [    7.703247]  show_stack+0x18/0x24 (C)
> [    7.703254]  dump_stack_lvl+0x34/0x8c
> [    7.703260]  dump_stack+0x18/0x24
> [    7.703263]  vpanic+0x324/0x368
> [    7.703270]  nmi_panic+0x0/0x64
> [    7.703274]  add_taint+0x0/0xbc
> [    7.703279]  arm64_serror_panic+0x70/0x80
> [    7.703282]  do_serror+0x3c/0x70
> [    7.703285]  el1h_64_error_handler+0x34/0x50
> [    7.703290]  el1h_64_error+0x6c/0x70
> [    7.703293]  nvme_pci_enable+0x54/0x598 [nvme] (P)
> [    7.703299]  nvme_probe+0x358/0x77c [nvme]
> [    7.703304]  local_pci_probe+0x40/0xa4
> [    7.703311]  pci_device_probe+0xc0/0x240
> [    7.703316]  really_probe+0xbc/0x29c
> [    7.703324]  __driver_probe_device+0x78/0x12c
> [    7.703329]  driver_probe_device+0xd8/0x15c
> [    7.703334]  __device_attach_driver+0xb8/0x134
> [    7.703339]  bus_for_each_drv+0x88/0xe8
> [    7.703344]  __device_attach_async_helper+0xb4/0xd8
> [    7.703349]  async_run_entry_fn+0x34/0xe0
> [    7.703353]  process_one_work+0x148/0x28c
> [    7.703360]  worker_thread+0x2d0/0x3d8
> [    7.703364]  kthread+0x12c/0x204
> [    7.703369]  ret_from_fork+0x10/0x20
> [    7.703374] SMP: stopping secondary CPUs
> [    7.708837] Kernel Offset: 0x3fcb6aa00000 from 0xffff800080000000
> [    7.708840] PHYS_OFFSET: 0xfff1000080000000
> [    7.708842] CPU features: 0x080000,04025000,40004001,0400421b
> [    7.708845] Memory Limit: none
> [    7.996038] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---
> 
> 
> The issue seems to be related to the SerDes PHY drivers
> (CONFIG_PHY_CADENCE_TORRENT and CONFIG_PHY_J721E_WIZ) being built as
> modules in the arm64 defconfig. Building them in instead makes PCIe
> work, at least on my system.
> 
> Toradex already reported this to TI for the downstream kernel [2], but
> it seems that it is also happening on upstream.
> 
> [1] https://lore.kernel.org/lkml/20251104144915.60445-1-francesco@dolcini.it/
> [2] https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1406797/am69-nvme-pcie-and-kernel-panic

The E2E thread above leads to another one, where the issue was reported to
occur only with an external reference clock and was fixed by switching to
the internal reference clock. Does this also hold true for the board that
you are using?

Regards,
Siddharth.
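
The workaround mentioned in the quoted report (building the SerDes PHY
drivers in rather than as modules) can be checked against a kernel .config
with a small script. The following is only a minimal sketch: the option
names come from the report, while the helper names option_states() and
modular_phys() are hypothetical and not part of any kernel tooling.

```python
import re

# Option names taken from the report; everything else here is illustrative.
PHY_OPTIONS = ("CONFIG_PHY_CADENCE_TORRENT", "CONFIG_PHY_J721E_WIZ")


def option_states(config_text, options=PHY_OPTIONS):
    """Map each option to 'y', 'm', or 'n' (unset) for a kernel .config text."""
    states = {opt: "n" for opt in options}
    for line in config_text.splitlines():
        # Lines such as "# CONFIG_FOO is not set" do not match and stay 'n'.
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if match and match.group(1) in states:
            states[match.group(1)] = match.group(2)
    return states


def modular_phys(config_text):
    """Return the PHY options that are built as modules (=m)."""
    return [opt for opt, val in option_states(config_text).items() if val == "m"]
```

Calling modular_phys() on the contents of a .config returns the PHY options
still set to =m; an empty list means that neither driver is built as a
module (each is either built in or unset).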
