Message-ID: <a861e306-4a46-4f26-a0c2-f6f657571d48@nvidia.com>
Date: Thu, 6 Nov 2025 10:06:34 +0000
From: Jon Hunter <jonathanh@...dia.com>
To: Vishwaroop A <va@...dia.com>, Mark Brown <broonie@...nel.org>,
 Thierry Reding <thierry.reding@...il.com>,
 Sowjanya Komatineni <skomatineni@...dia.com>,
 Laxman Dewangan <ldewangan@...dia.com>, smangipudi@...dia.com,
 kyarlagadda@...dia.com
Cc: linux-spi@...r.kernel.org, linux-tegra@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 0/3] spi: tegra210-quad: Improve timeout handling under
 high system load


On 28/10/2025 15:57, Vishwaroop A wrote:
> Hi,
> 
> This patch series addresses timeout handling issues in the Tegra QSPI driver
> that occur under high system load conditions. We've observed that when CPUs
> are saturated (due to error injection, RAS firmware activity, or general CPU
> contention), QSPI interrupt handlers can be delayed, causing spurious transfer
> failures even though the hardware completed the operation successfully.
> 
> Patch 1 fixes a stale pointer issue by ensuring curr_xfer is cleared on timeout
> and checked when the IRQ thread finally runs. It also ensures interrupts are
> properly cleared on failure paths.
> 
> Patch 2 refactors the timeout cleanup code into dedicated helper functions
> (tegra_qspi_reset, tegra_qspi_dma_stop, tegra_qspi_pio_stop) to improve code
> readability and maintainability. This is purely a code reorganization with no
> functional changes.
> 
> Patch 3 adds hardware status checking on timeout. Before failing a transfer,
> the driver now reads QSPI_TRANS_STATUS to verify if the hardware actually
> completed the operation. If so, it manually invokes the completion handler
> instead of failing the transfer. This distinguishes genuine hardware timeouts
> from delayed/lost interrupts.
> 
> These changes have been tested in production environments under various high
> load scenarios including RAS testing and CPU saturation workloads.
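
A minimal user-space sketch of the flow the cover letter describes for patches 1 and 3, assuming a simulated status word in place of a real register read. It is not the driver code: every `sketch_*` / `SKETCH_*` name is hypothetical, and the bit position used for the "transfer done" flag is an assumption, since the layout of QSPI_TRANS_STATUS is not given in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define SKETCH_TRANS_READY (1u << 30)

enum sketch_result {
    SKETCH_DONE_BY_IRQ,  /* interrupt arrived before the wait expired */
    SKETCH_RECOVERED,    /* wait expired, but hardware reports completion */
    SKETCH_TIMED_OUT     /* wait expired and hardware did not finish */
};

struct sketch_qspi {
    uint32_t trans_status;     /* stands in for a read of QSPI_TRANS_STATUS */
    const void *curr_xfer;     /* transfer in flight, NULL when idle */
    unsigned int completions;  /* how often the completion handler ran */
};

/* Stand-in for the normal completion handler. */
static void sketch_complete(struct sketch_qspi *q)
{
    q->completions++;
    q->curr_xfer = NULL;
}

/* Called once the wait for the interrupt has ended, one way or the other. */
static enum sketch_result sketch_finish_wait(struct sketch_qspi *q,
                                             bool irq_arrived)
{
    if (irq_arrived) {
        sketch_complete(q);
        return SKETCH_DONE_BY_IRQ;
    }

    /* Patch 3 idea: consult the hardware status before failing. */
    if (q->trans_status & SKETCH_TRANS_READY) {
        q->trans_status &= ~SKETCH_TRANS_READY;  /* acknowledge */
        sketch_complete(q);                      /* invoke handler manually */
        return SKETCH_RECOVERED;
    }

    /* Patch 1 idea: clear the pointer so a late IRQ thread finds nothing. */
    q->curr_xfer = NULL;
    return SKETCH_TIMED_OUT;
}

/* Late-running IRQ thread: returns true only if it handled a transfer. */
static bool sketch_irq_thread(struct sketch_qspi *q)
{
    if (q->curr_xfer == NULL)
        return false;  /* transfer already finished or failed elsewhere */
    sketch_complete(q);
    return true;
}
```

With `trans_status` set to `SKETCH_TRANS_READY` and no interrupt delivered, `sketch_finish_wait()` returns `SKETCH_RECOVERED` rather than a failure, and a subsequent `sketch_irq_thread()` call returns false because `curr_xfer` is already NULL.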


For the series ...

Tested-by: Jon Hunter <jonathanh@...dia.com>
Reviewed-by: Jon Hunter <jonathanh@...dia.com>

Thanks
Jon

-- 
nvpublic