Message-ID: <oz6f5mcxi7jxyubrd6dpdltusogv5ortbmll6rom5c2bja2x7o@brsqolpmp5x7>
Date: Wed, 8 Jan 2025 17:40:50 +0100
From: Thierry Reding <thierry.reding@...il.com>
To: Parker Newman <parker@...est.io>
Cc: Alexandre Torgue <alexandre.torgue@...s.st.com>, 
	Jose Abreu <joabreu@...opsys.com>, Andrew Lunn <andrew+netdev@...n.ch>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, 
	Jonathan Hunter <jonathanh@...dia.com>, Maxime Coquelin <mcoquelin.stm32@...il.com>, 
	netdev@...r.kernel.org, linux-tegra@...r.kernel.org, 
	linux-stm32@...md-mailman.stormreply.com, linux-arm-kernel@...ts.infradead.org, 
	linux-kernel@...r.kernel.org, Parker Newman <pnewman@...necttech.com>
Subject: Re: [PATCH net v2 1/1] net: stmmac: dwmac-tegra: Read iommu stream id from device tree

On Tue, Jan 07, 2025 at 04:24:59PM -0500, Parker Newman wrote:
> From: Parker Newman <pnewman@...necttech.com>
> 
> Nvidia's Tegra MGBE controllers require the IOMMU "Stream ID" (SID) to be
> written to the MGBE_WRAP_AXI_ASID0_CTRL register.
> 
> The current driver is hard-coded to use MGBE0's SID for all controllers.
> This causes softirq timeouts and kernel panics when using controllers
> other than MGBE0.
> 
> Example dmesg errors when an ethernet cable is connected to MGBE1:
> 
> [  116.133290] tegra-mgbe 6910000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx
> [  121.851283] tegra-mgbe 6910000.ethernet eth1: NETDEV WATCHDOG: CPU: 5: transmit queue 0 timed out 5690 ms
> [  121.851782] tegra-mgbe 6910000.ethernet eth1: Reset adapter.
> [  121.892464] tegra-mgbe 6910000.ethernet eth1: Register MEM_TYPE_PAGE_POOL RxQ-0
> [  121.905920] tegra-mgbe 6910000.ethernet eth1: PHY [stmmac-1:00] driver [Aquantia AQR113] (irq=171)
> [  121.907356] tegra-mgbe 6910000.ethernet eth1: Enabling Safety Features
> [  121.907578] tegra-mgbe 6910000.ethernet eth1: IEEE 1588-2008 Advanced Timestamp supported
> [  121.908399] tegra-mgbe 6910000.ethernet eth1: registered PTP clock
> [  121.908582] tegra-mgbe 6910000.ethernet eth1: configuring for phy/10gbase-r link mode
> [  125.961292] tegra-mgbe 6910000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx
> [  181.921198] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [  181.921404] rcu: 	7-....: (1 GPs behind) idle=540c/1/0x4000000000000002 softirq=1748/1749 fqs=2337
> [  181.921684] rcu: 	(detected by 4, t=6002 jiffies, g=1357, q=1254 ncpus=8)
> [  181.921878] Sending NMI from CPU 4 to CPUs 7:
> [  181.921886] NMI backtrace for cpu 7
> [  181.922131] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.13.0-rc3+ #6
> [  181.922390] Hardware name: NVIDIA CTI Forge + Orin AGX/Jetson, BIOS 202402.1-Unknown 10/28/2024
> [  181.922658] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [  181.922847] pc : handle_softirqs+0x98/0x368
> [  181.922978] lr : __do_softirq+0x18/0x20
> [  181.923095] sp : ffff80008003bf50
> [  181.923189] x29: ffff80008003bf50 x28: 0000000000000008 x27: 0000000000000000
> [  181.923379] x26: ffffce78ea277000 x25: 0000000000000000 x24: 0000001c61befda0
> [  181.924486] x23: 0000000060400009 x22: ffffce78e99918bc x21: ffff80008018bd70
> [  181.925568] x20: ffffce78e8bb00d8 x19: ffff80008018bc20 x18: 0000000000000000
> [  181.926655] x17: ffff318ebe7d3000 x16: ffff800080038000 x15: 0000000000000000
> [  181.931455] x14: ffff000080816680 x13: ffff318ebe7d3000 x12: 000000003464d91d
> [  181.938628] x11: 0000000000000040 x10: ffff000080165a70 x9 : ffffce78e8bb0160
> [  181.945804] x8 : ffff8000827b3160 x7 : f9157b241586f343 x6 : eeb6502a01c81c74
> [  181.953068] x5 : a4acfcdd2e8096bb x4 : ffffce78ea277340 x3 : 00000000ffffd1e1
> [  181.960329] x2 : 0000000000000101 x1 : ffffce78ea277340 x0 : ffff318ebe7d3000
> [  181.967591] Call trace:
> [  181.970043]  handle_softirqs+0x98/0x368 (P)
> [  181.974240]  __do_softirq+0x18/0x20
> [  181.977743]  ____do_softirq+0x14/0x28
> [  181.981415]  call_on_irq_stack+0x24/0x30
> [  181.985180]  do_softirq_own_stack+0x20/0x30
> [  181.989379]  __irq_exit_rcu+0x114/0x140
> [  181.993142]  irq_exit_rcu+0x14/0x28
> [  181.996816]  el1_interrupt+0x44/0xb8
> [  182.000316]  el1h_64_irq_handler+0x14/0x20
> [  182.004343]  el1h_64_irq+0x80/0x88
> [  182.007755]  cpuidle_enter_state+0xc4/0x4a8 (P)
> [  182.012305]  cpuidle_enter+0x3c/0x58
> [  182.015980]  cpuidle_idle_call+0x128/0x1c0
> [  182.020005]  do_idle+0xe0/0xf0
> [  182.023155]  cpu_startup_entry+0x3c/0x48
> [  182.026917]  secondary_start_kernel+0xdc/0x120
> [  182.031379]  __secondary_switched+0x74/0x78
> [  212.971162] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 7-.... } 6103 jiffies s: 417 root: 0x80/.
> [  212.985935] rcu: blocking rcu_node structures (internal RCU debug):
> [  212.992758] Sending NMI from CPU 0 to CPUs 7:
> [  212.998539] NMI backtrace for cpu 7
> [  213.004304] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.13.0-rc3+ #6
> [  213.016116] Hardware name: NVIDIA CTI Forge + Orin AGX/Jetson, BIOS 202402.1-Unknown 10/28/2024
> [  213.030817] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [  213.040528] pc : handle_softirqs+0x98/0x368
> [  213.046563] lr : __do_softirq+0x18/0x20
> [  213.051293] sp : ffff80008003bf50
> [  213.055839] x29: ffff80008003bf50 x28: 0000000000000008 x27: 0000000000000000
> [  213.067304] x26: ffffce78ea277000 x25: 0000000000000000 x24: 0000001c61befda0
> [  213.077014] x23: 0000000060400009 x22: ffffce78e99918bc x21: ffff80008018bd70
> [  213.087339] x20: ffffce78e8bb00d8 x19: ffff80008018bc20 x18: 0000000000000000
> [  213.097313] x17: ffff318ebe7d3000 x16: ffff800080038000 x15: 0000000000000000
> [  213.107201] x14: ffff000080816680 x13: ffff318ebe7d3000 x12: 000000003464d91d
> [  213.116651] x11: 0000000000000040 x10: ffff000080165a70 x9 : ffffce78e8bb0160
> [  213.127500] x8 : ffff8000827b3160 x7 : 0a37b344852820af x6 : 3f049caedd1ff608
> [  213.138002] x5 : cff7cfdbfaf31291 x4 : ffffce78ea277340 x3 : 00000000ffffde04
> [  213.150428] x2 : 0000000000000101 x1 : ffffce78ea277340 x0 : ffff318ebe7d3000
> [  213.162063] Call trace:
> [  213.165494]  handle_softirqs+0x98/0x368 (P)
> [  213.171256]  __do_softirq+0x18/0x20
> [  213.177291]  ____do_softirq+0x14/0x28
> [  213.182017]  call_on_irq_stack+0x24/0x30
> [  213.186565]  do_softirq_own_stack+0x20/0x30
> [  213.191815]  __irq_exit_rcu+0x114/0x140
> [  213.196891]  irq_exit_rcu+0x14/0x28
> [  213.202401]  el1_interrupt+0x44/0xb8
> [  213.207741]  el1h_64_irq_handler+0x14/0x20
> [  213.213519]  el1h_64_irq+0x80/0x88
> [  213.217541]  cpuidle_enter_state+0xc4/0x4a8 (P)
> [  213.224364]  cpuidle_enter+0x3c/0x58
> [  213.228653]  cpuidle_idle_call+0x128/0x1c0
> [  213.233993]  do_idle+0xe0/0xf0
> [  213.237928]  cpu_startup_entry+0x3c/0x48
> [  213.243791]  secondary_start_kernel+0xdc/0x120
> [  213.249830]  __secondary_switched+0x74/0x78
> 
> This bug has existed since the dwmac-tegra driver was added in Dec 2022
> (See Fixes tag below for commit hash).
> 
> The Tegra234 SoC has 4 MGBE controllers; however, Nvidia's Developer Kit
> only uses MGBE0, which is why the bug was not found previously. Connect Tech
> has many products that use 2 (or more) MGBE controllers.
> 
> The solution is to read the controller's SID from the existing "iommus"
> device tree property. The 2nd field of the "iommus" device tree property
> is the controller's SID.
> 
> Device tree snippet from tegra234.dtsi showing MGBE1's "iommus" property:
> 
> smmu_niso0: iommu@...00000 {
>         compatible = "nvidia,tegra234-smmu", "nvidia,smmu-500";
> ...
> }
> 
> /* MGBE1 */
> ethernet@...0000 {
> 	compatible = "nvidia,tegra234-mgbe";
> ...
> 	iommus = <&smmu_niso0 TEGRA234_SID_MGBE_VF1>;
> ...
> }
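
To illustrate the "2nd field" point above: in a flattened device tree, property cells are 32-bit big-endian values, so with one specifier cell after the phandle the SID sits at byte offset 4 of the raw "iommus" data. The following is a minimal user-space sketch under that assumption. The function names are hypothetical and this is not how the driver obtains the SID (it uses the helper discussed further down).

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one 32-bit big-endian device tree cell. */
static uint32_t be32_cell(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper: extract the second cell of a raw "iommus"
 * property, assuming the layout <phandle sid>.
 * Returns 0 on success, -1 if the arguments are invalid or the
 * property is too short to hold two cells.
 */
static int iommus_second_cell(const uint8_t *prop, size_t len, uint32_t *sid)
{
	if (!prop || !sid || len < 2 * sizeof(uint32_t))
		return -1;

	*sid = be32_cell(prop + sizeof(uint32_t));
	return 0;
}
```

The first cell (the phandle to smmu_niso0) is skipped on purpose; only the specifier cell carries the stream ID.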
> 
> Nvidia's arm-smmu driver reads the "iommus" property and stores the SID in
> the MGBE device's "fwspec" struct. The dwmac-tegra driver can access the
> SID using the tegra_dev_iommu_get_stream_id() helper function found in
> linux/iommu.h.
> 
> Calling tegra_dev_iommu_get_stream_id() should not fail unless the "iommus"
> property is removed from the device tree or the IOMMU is disabled.
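
The lookup and failure path described above can be sketched in user space roughly as follows. Everything prefixed with mock_ is a hypothetical stand-in: struct mock_fwspec imitates the kernel's fwspec, and mock_get_stream_id() imitates tegra_dev_iommu_get_stream_id(). The single-ID check and the 16-bit mask are assumptions about how the real helper behaves, not something stated in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the kernel's fwspec data. */
struct mock_fwspec {
	unsigned int num_ids;
	uint32_t ids[4];
};

/*
 * Stand-in for the stream ID helper. Assumed behaviour: succeed only
 * when exactly one ID is attached, and keep the low 16 bits.
 */
static bool mock_get_stream_id(const struct mock_fwspec *fwspec, uint32_t *sid)
{
	if (fwspec && fwspec->num_ids == 1) {
		*sid = fwspec->ids[0] & 0xffff;
		return true;
	}

	return false;
}

/*
 * Hypothetical probe-time step: compute the value that would be
 * written to MGBE_WRAP_AXI_ASID0_CTRL, or fail if no SID is known
 * (for example, "iommus" missing or the IOMMU disabled).
 * Returns 0 on success, -1 on failure.
 */
static int mock_mgbe_program_sid(const struct mock_fwspec *fwspec,
				 uint32_t *asid0_ctrl)
{
	uint32_t sid;

	if (!mock_get_stream_id(fwspec, &sid))
		return -1;

	*asid0_ctrl = sid;
	return 0;
}
```

Failing the probe when no SID is available, instead of falling back to a fixed value, avoids silently reusing MGBE0's SID on another controller, which is the failure mode shown in the log above.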
> 
> While the Tegra234 SoC technically supports bypassing the IOMMU, doing so
> is not supported by the current firmware, has not been tested, and is not
> recommended. More detailed discussion with Thierry Reding from Nvidia is
> linked below.
> 
> Fixes: d8ca113724e7 ("net: stmmac: tegra: Add MGBE support")
> Link: https://lore.kernel.org/netdev/cover.1731685185.git.pnewman@connecttech.com
> Signed-off-by: Parker Newman <pnewman@...necttech.com>
> ---
> 
> Changes v2:
> - dropped cover letter
> - added more detail to commit message
> - rebased to latest netdev tree
> 
>  drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)

Acked-by: Thierry Reding <treding@...dia.com>
