Message-ID: <20250318-psychedelic-thundering-guppy-22bba2@leitao>
Date: Tue, 18 Mar 2025 11:32:40 -0700
From: Breno Leitao <leitao@...ian.org>
To: Mark Brown <broonie@...ian.org>, arnd@...db.de
Cc: Thierry Reding <thierry.reding@...il.com>,
Jonathan Hunter <jonathanh@...dia.com>,
Sowjanya Komatineni <skomatineni@...dia.com>,
Laxman Dewangan <ldewangan@...dia.com>, linux-tegra@...r.kernel.org,
linux-spi@...r.kernel.org, linux-kernel@...r.kernel.org,
rmikey@...a.com, kernel-team@...a.com, gregkh@...uxfoundation.org,
noodles@...th.li, jarkko@...nel.org, peterhuewe@....de, jgg@...pe.c
Subject: Re: [PATCH 1/3] spi: tegra210-quad: use device_reset_optional()
instead of device_reset()
On Tue, Mar 18, 2025 at 11:29:26AM -0700, Breno Leitao wrote:
> On Tue, Mar 18, 2025 at 05:34:55PM +0000, Mark Brown wrote:
> > On Tue, Mar 18, 2025 at 10:02:47AM -0700, Breno Leitao wrote:
> >
> > > Makes sense. Another question: for platforms like this one that don't
> > > have the device reset methods, what can we do to stop the bleed?
> >
> > > Basically every message that is sent to the SPI controller will fail,
> > > which will trigger device_reset(), which is a no-op, so the device
> > > will continue to be online. Should we disable the device after some
> > > point?
> >
> > The SPI controller is only going to be doing something because some
> > driver for an attached SPI device is trying to do something. Presumably
> > whatever driver that is won't be having a good time and can hopefully
> > figure something out, though given that SPI is simple and not
> > hotpluggable this isn't really something that comes up a lot in
> > production so I'd be unsurprised to see things just keep on retrying.
> > I'd expect to see any substantial error handling in the driver for the
> > device rather than in the controller.
>
> Good point. In my specific case, this is coming from tpm_tis,
> which is not aware that the device is totally dead, and continues to ask
> for random numbers:
>
> tegra_qspi_transfer_one_message
> __spi_pump_transfer_message
> __spi_sync
> spi_sync
> tpm_tis_spi_transfer
> tpm_tis_spi_read_bytes
> tpm_tis_request_locality
> tpm_chip_start
> tpm_try_get_ops
> tpm_find_get_ops
> tpm_get_random
> tpm_hwrng_read
> hwrng_fillfn
> kthread
> ret_from_fork
>
> Looking at tpm_tis, it seems it doesn't care if the SPI is dead, and
> just forwards the requests, which never complete. Adding Arnd to
> see if he has any idea about this.
>
> Arnd,
>
> Summary of the problem: tpm_tis is trying to read random numbers
> through a dead SPI controller. That causes an endless stream of warnings
> in the kernel log, given that the controller WARNs on timeouts (which
> is being fixed in one of the patches in this patchset).
>
> Question: Should tpm_tis be aware that the underlying SPI controller is
> dead, and eventually get unplugged?
Adding Arnd to the email.
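
To make the question concrete, here is a minimal userspace sketch of what
"disable the device after some point" could look like. It assumes a
hypothetical consecutive-failure counter; the names dev_health,
guarded_transfer and MAX_CONSECUTIVE_FAILURES are made up for illustration
and do not exist in tpm_tis or the SPI core.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical threshold, chosen only for illustration. */
#define MAX_CONSECUTIVE_FAILURES 3

/* Hypothetical per-device state; not a real kernel structure. */
struct dev_health {
	unsigned int consecutive_failures;
	bool disabled;
};

/* Counts how many times the (simulated) bus is actually touched. */
static unsigned int bus_calls;

/* Stand-in for a transfer through a dead controller: always fails. */
static int counting_dead_transfer(void)
{
	bus_calls++;
	return -1;
}

/*
 * Attempt one transfer through xfer().
 * Returns 0 on success, -1 on failure or when the device has already
 * been marked disabled. A success resets the failure counter; reaching
 * the threshold marks the device disabled so later requests are
 * rejected without touching the bus again.
 */
static int guarded_transfer(struct dev_health *h, int (*xfer)(void))
{
	assert(h && xfer);

	if (h->disabled)
		return -1;

	if (xfer() == 0) {
		h->consecutive_failures = 0;
		return 0;
	}

	if (++h->consecutive_failures >= MAX_CONSECUTIVE_FAILURES)
		h->disabled = true;

	return -1;
}
```

With this policy, a caller that keeps retrying (as hwrng_fillfn does in the
trace above) would only reach the dead controller a bounded number of times.
Whether such a policy belongs in the device driver, and how a device would
be re-enabled, is exactly the open question.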