Message-ID: <jzmbuiqm5usjfklqs2cmxz72j5qjvttcib6grn5visgqe37qtx@cowi4mtcvwfh>
Date: Wed, 12 Nov 2025 06:39:08 -0800
From: Breno Leitao <leitao@...ian.org>
To: Vishwaroop A <va@...dia.com>
Cc: Mark Brown <broonie@...nel.org>,
Thierry Reding <thierry.reding@...il.com>, Jonathan Hunter <jonathanh@...dia.com>,
Sowjanya Komatineni <skomatineni@...dia.com>, Laxman Dewangan <ldewangan@...dia.com>, smangipudi@...dia.com,
kyarlagadda@...dia.com, linux-spi@...r.kernel.org, linux-tegra@...r.kernel.org,
linux-kernel@...r.kernel.org, Thierry Reding <treding@...dia.com>
Subject: Re: [PATCH v5 1/3] spi: tegra210-quad: Fix timeout handling
On Tue, Oct 28, 2025 at 03:57:01PM +0000, Vishwaroop A wrote:
> When the CPU that the QSPI interrupt handler runs on (typically CPU 0)
> is excessively busy, it can lead to rare cases of the IRQ thread not
> running before the transfer timeout is reached.
>
> While handling the timeouts, any pending transfers are cleaned up and
> the message that they correspond to is marked as failed, which leaves
> the curr_xfer field pointing at stale memory.
I hit something similar on one of my hosts. After debugging it, it looks
like the same problem you are fixing here.
Sharing what I captured while debugging, in case it is useful:
UBSAN: shift-out-of-bounds in drivers/spi/spi-tegra210-quad.c:385:25
shift exponent 198 is too large for 32-bit type 'u32' (aka 'unsigned int')
CPU: 0 UID: 0 PID: 883 Comm: irq/43-NVDA1513 Tainted: G W E N 6.16.1 #1 PREEMPT(none)
Tainted: [W]=WARN, [E]=UNSIGNED_MODULE, [N]=TEST
Hardware name: Quanta JAVA ISLAND PVT 29F0EMAZ049/Java Island, BIOS F0EJ3A14 09/02/2025
Call trace:
show_stack+0x1c/0x30 (C)
dump_stack_lvl+0x38/0xb0
dump_stack+0x14/0x1c
__ubsan_handle_shift_out_of_bounds+0x24c/0x2c0
tegra_qspi_isr_thread+0x1cc8/0x1e60 [spi_tegra210_quad]
irq_thread_fn+0x80/0x108
irq_thread+0x158/0x258
kthread+0x3fc/0x530
ret_from_fork+0x10/0x20
---[ end trace ]---
------------[ cut here ]------------
UBSAN: shift-out-of-bounds in drivers/spi/spi-tegra210-quad.c:397:20
shift exponent 32 is too large for 32-bit type 'u32' (aka 'unsigned int')
CPU: 0 UID: 0 PID: 883 Comm: irq/43-NVDA1513 Tainted: G W E N 6.16.1 #1 PREEMPT(none)
Tainted: [W]=WARN, [E]=UNSIGNED_MODULE, [N]=TEST
Hardware name: Quanta JAVA ISLAND PVT 29F0EMAZ049/Java Island, BIOS F0EJ3A14 09/02/2025
Call trace:
show_stack+0x1c/0x30 (C)
dump_stack_lvl+0x38/0xb0
dump_stack+0x14/0x1c
__ubsan_handle_shift_out_of_bounds+0x24c/0x2c0
tegra_qspi_isr_thread+0xc90/0x1e60 [spi_tegra210_quad]
irq_thread_fn+0x80/0x108
irq_thread+0x158/0x258
kthread+0x3fc/0x530
ret_from_fork+0x10/0x20
---[ end trace ]---
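Both reports above are about shifting a 32-bit value by 32 or more, which is undefined behaviour in C. For readers following along, here is a minimal userspace sketch of a guarded shift. The helpers checked_shl32() and low_mask32() are hypothetical names used for illustration only; they are not taken from spi-tegra210-quad.c, and this does not claim to reproduce the expressions at lines 385 and 397.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not driver code: shift a 32-bit value left only
 * when the exponent is in range. An exponent of 32 or more (such as the
 * 198 and 32 reported by UBSAN) is rejected instead of being executed.
 */
static bool checked_shl32(uint32_t value, unsigned int shift, uint32_t *out)
{
	if (shift >= 32)
		return false;	/* caller should treat the input as corrupt */
	*out = value << shift;
	return true;
}

/* Hypothetical helper: mask covering the low `bits` bits, valid for 0..32. */
static uint32_t low_mask32(unsigned int bits)
{
	assert(bits <= 32);
	if (bits == 32)
		return UINT32_MAX;	/* avoids the invalid 1u << 32 */
	return (UINT32_C(1) << bits) - 1;
}
```

A guard like this only hides the symptom. If the exponent is derived from a transfer structure that is already stale, the underlying fix is still to stop the IRQ thread from reading that structure.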
This was followed by a KASAN report and a kernel crash:
BUG: KASAN: vmalloc-out-of-bounds in tegra_qspi_isr_thread+0xce8/0x1e60 [spi_tegra210_quad]
Write of size 1 at addr ffff8000db950000 by task irq/43-NVDA1513/883
CPU: 0 UID: 0 PID: 883 Comm: irq/43-NVDA1513 Tainted: G W E N 6.16.1-0_fbk0_debug_rc20_0_g977c20cb5846 #1 PREEMPT(none)
Tainted: [W]=WARN, [E]=UNSIGNED_MODULE, [N]=TEST
Hardware name: Quanta JAVA ISLAND PVT 29F0EMAZ049/Java Island, BIOS F0EJ3A14 09/02/2025
Call trace:
show_stack+0x1c/0x30 (C)
dump_stack_lvl+0x38/0xb0
print_report+0x164/0x6d8
kasan_report+0xcc/0x128
__asan_report_store1_noabort+0x1c/0x28
tegra_qspi_isr_thread+0xce8/0x1e60 [spi_tegra210_quad]
irq_thread_fn+0x80/0x108
irq_thread+0x158/0x258
kthread+0x3fc/0x530
ret_from_fork+0x10/0x20
The buggy address belongs to a 1-page vmalloc region starting at 0xffff8000db940000 allocated at copy_process+0x258/0x28d8
Memory state around the buggy address:
ffff8000db94ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff8000db94ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff8000db950000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
^
ffff8000db950080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
ffff8000db950100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
==================================================================
Unable to handle kernel paging request at virtual address ffff8000db950000
KASAN: probably user-memory-access in range [0x00000006dca80000-0x00000006dca80007]
Mem abort info:
ESR = 0x0000000096000047
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x07: level 3 translation fault
Data abort info:
ISV = 0, ISS = 0x00000047, ISS2 = 0x00000000
CM = 0, WnR = 1, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
pstate: 234010c9 (nzCv daIF +PAN -UAO +TCO +DIT +SSBS BTYPE=--)
pc : tegra_qspi_isr_thread+0xcc0/0x1e60 [spi_tegra210_quad]
lr : tegra_qspi_isr_thread+0xce8/0x1e60 [spi_tegra210_quad]
x26: 0000000000000001 x25: 0000000000000028 x24: ffff8000db94ffff
x23: ffff0000d16b0918 x22: 0000000000000040 x21: 000000000000003a
x20: ffff8000db94ffff x19: ffff0000d16b08c0 x18: 0000000000000001
x17: 3d3d3d3d3d3b2d2c x16: 3d3d3d3d3d3b2d2c x15: 0000000000000001
x14: 1ffff00010bfce80 x13: 0000000000000000 x12: 0000000000000000
x11: ffff700010bfce81 x10: 0000000000000019 x9 : 0000000000000000
x8 : 0000000000000000 x7 : 0000000000000001 x6 : 0000000000000001
x5 : ffff8000b49cf8e0 x4 : ffff800084e7b140 x3 : ffff8000801bbd8c
x2 : 0000000000000001 x1 : 0000000000000008 x0 : 0000000000000001
Call trace:
tegra_qspi_isr_thread+0xcc0/0x1e60 [spi_tegra210_quad] (P)
irq_thread_fn+0x80/0x108
irq_thread+0x158/0x258
kthread+0x3fc/0x530
ret_from_fork+0x10/0x20
Code: 540001aa 1ad92768 f85f83aa 6b1a039f (383a6b08)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception
SMP: stopping secondary CPUs
Kernel Offset: disabled
CPU features: 0x2000,000003c0,62534ca1,5467fea7
Memory Limit: none
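The failure mode described in the quoted patch text (the timeout path fails the message, and a late IRQ thread then follows curr_xfer into stale memory) can be modelled in a few lines. This is a simplified userspace sketch under stated assumptions, not the driver's actual code: struct xfer, struct qspi_model, model_handle_timeout() and model_isr_thread() are hypothetical names, and the locking a real driver would need between the two paths is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's state. */
struct xfer {
	unsigned int len;
};

struct qspi_model {
	struct xfer *curr_xfer;	/* NULL means no transfer is in flight */
};

/*
 * Timeout path: after the message has been failed, drop the reference so
 * nothing can later follow a pointer to memory the caller may have freed.
 */
static void model_handle_timeout(struct qspi_model *q)
{
	assert(q != NULL);
	q->curr_xfer = NULL;
}

/*
 * Late IRQ thread: returns true only if there is a live transfer to
 * process. A NULL curr_xfer means the timeout path already cleaned up,
 * so the handler bails out without touching any transfer fields.
 */
static bool model_isr_thread(struct qspi_model *q, unsigned int *len_out)
{
	struct xfer *t;

	assert(q != NULL);
	t = q->curr_xfer;
	if (!t)
		return false;
	*len_out = t->len;
	return true;
}
```

In this model, the shift-out-of-bounds and out-of-bounds write seen in the logs correspond to what happens when the NULL check is missing and the fields read through curr_xfer no longer hold valid values.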