linux-kernel - Re: [PATCH 1/3] spi: tegra210-quad: use device_reset_optional() instead of device

Message-ID: <20250318-psychedelic-thundering-guppy-22bba2@leitao>
Date: Tue, 18 Mar 2025 11:32:40 -0700
From: Breno Leitao <leitao@...ian.org>
To: Mark Brown <broonie@...ian.org>, arnd@...db.de
Cc: Thierry Reding <thierry.reding@...il.com>,
	Jonathan Hunter <jonathanh@...dia.com>,
	Sowjanya Komatineni <skomatineni@...dia.com>,
	Laxman Dewangan <ldewangan@...dia.com>, linux-tegra@...r.kernel.org,
	linux-spi@...r.kernel.org, linux-kernel@...r.kernel.org,
	rmikey@...a.com, kernel-team@...a.com, gregkh@...uxfoundation.org,
	noodles@...th.li, jarkko@...nel.org, peterhuewe@....de, jgg@...pe.c
Subject: Re: [PATCH 1/3] spi: tegra210-quad: use device_reset_optional()
 instead of device_reset()

On Tue, Mar 18, 2025 at 11:29:26AM -0700, Breno Leitao wrote:
> On Tue, Mar 18, 2025 at 05:34:55PM +0000, Mark Brown wrote:
> > On Tue, Mar 18, 2025 at 10:02:47AM -0700, Breno Leitao wrote:
> > 
> > > Makes sense. Another question: for platforms like this one that don't
> > > have the device reset methods, what can we do to stop the bleeding?
> > 
> > > Basically every message that is sent to the SPI controller will fail,
> > > which will trigger device_reset(), which is a no-op, but the device
> > > will continue to be online. Should we disable the device after some
> > > point?
> > 
> > The SPI controller is only going to be doing something because some
> > driver for an attached SPI device is trying to do something.  Presumably
> > whatever driver that is won't be having a good time and can hopefully
> > figure something out, though given that SPI is simple and not
> > hotpluggable this isn't really something that comes up a lot in
> > production so I'd be unsurprised to see things just keep on retrying.
> > I'd expect to see any substantial error handling in the driver for the
> > device rather than in the controller.
> 
> Good point. In my specific case, this is coming from tpm_tis,
> which is not aware that the device is totally dead, and continues to ask
> for random numbers:
> 
>             tegra_qspi_transfer_one_message
>             __spi_pump_transfer_message
>             __spi_sync
>             spi_sync
>             tpm_tis_spi_transfer
>             tpm_tis_spi_read_bytes
>             tpm_tis_request_locality
>             tpm_chip_start
>             tpm_try_get_ops
>             tpm_find_get_ops
>             tpm_get_random
>             tpm_hwrng_read
>             hwrng_fillfn
>             kthread
>             ret_from_fork
> 
> Looking at tpm_tis, it seems it doesn't care if the SPI is dead, and
> just forwards the requests, which never complete. Adding Arnd to
> see if he has any idea about this.
> 
> Arnd,
> 
> Summary of the problem: tpm_tis is trying to read random numbers
> through a dead SPI controller. That causes an endless stream of warnings
> in the kernel log, given that the controller is WARNing on timeouts (which
> is being fixed in one of the patches in this patchset).
> 
> Question: Should tpm_tis be aware that the underlying SPI controller is
> dead, and eventually get unplugged?

Adding Arnd to the email.
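
To make the question concrete, here is a minimal userspace sketch of the kind
of "failure budget" a device driver could keep in front of its bus transfers.
Everything in it is hypothetical and for illustration only: struct bus_health,
guarded_transfer(), BUS_DEAD_THRESHOLD and the always_timeout() stub are not
existing tpm_tis or SPI core code, and the threshold value is arbitrary. The
idea is that after enough consecutive failed transfers the driver marks the
bus as dead and fails fast with -ENODEV instead of issuing more transfers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold: consecutive failures before giving up. */
#define BUS_DEAD_THRESHOLD 8

/* Hypothetical per-device bookkeeping, not an existing kernel struct. */
struct bus_health {
	unsigned int consecutive_failures;
	bool dead;
};

/* Hypothetical transfer callback: returns 0 or a negative errno. */
typedef int (*xfer_fn)(void *ctx);

/* Stub standing in for a dead controller: counts calls, always times out. */
static int always_timeout(void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	return -ETIMEDOUT;
}

/*
 * Run one transfer unless the bus was already declared dead.
 * A success resets the counter; BUS_DEAD_THRESHOLD failures in a row
 * latch the dead flag, after which no further transfers are attempted.
 */
static int guarded_transfer(struct bus_health *h, xfer_fn xfer, void *ctx)
{
	int ret;

	assert(h != NULL && xfer != NULL);

	if (h->dead)
		return -ENODEV;	/* fail fast, do not touch the bus */

	ret = xfer(ctx);
	if (ret == 0) {
		h->consecutive_failures = 0;
		return 0;
	}

	if (++h->consecutive_failures >= BUS_DEAD_THRESHOLD)
		h->dead = true;

	return ret;
}
```

With the stub above, the first eight calls return -ETIMEDOUT and reach the
(simulated) controller; every later call returns -ENODEV without touching it.
Whether such a latch belongs in tpm_tis, and whether it should ever be cleared
again, is exactly the open question.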