[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <AAA7943B-B1F1-4389-AAC3-8621EC6E38B8@martin.sperl.org>
Date: Tue, 15 Jan 2019 18:39:27 +0100
From: kernel@...tin.sperl.org
To: Jon Hunter <jonathanh@...dia.com>
Cc: Mark Brown <broonie@...nel.org>,
linux-tegra <linux-tegra@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-spi@...r.kernel.org
Subject: Re: Regression: spi: core: avoid waking pump thread from spi_sync instead run teardown delayed
Hi John!
> On 15.01.2019, at 15:26, Jon Hunter <jonathanh@...dia.com> wrote:
>> Looks as if there is something missing in spi_stop_queue that
>> would wake the worker thread one last time without any delays
>> and finish the hw shutdown immediately - it runs as a delayed
>> task...
>>
>> One question: do you run any spi transfers in
>> your test case before suspend?
>
> No and before suspending I dumped some of the spi stats and I see no
> tranfers/messages at all ...
>
> Stats for spi1 ...
> Bytes: 0
> Errors: 0
> Messages: 0
> Transfers: 0
This...
>> /sys/class/spi_master/spi1/statistics/messages gives some
>> counters on the number of spi messages processed which
>> would give you an indication if that is happening.
>>
>> It could be as easy as adding right after the first lock
>> in spi_stop_queue:
>> kthread_mod_delayed_work(&ctlr->kworker,
>> &ctlr->pump_idle_teardown, 0);
>> (plus maybe a yield or similar to allow the worker to
>> quickly/reliably run on a single core machine)
>>
>> I hope that this initial guess helps.
>
> Unfortunately, the above did not help and the issue persists.
>
> Digging a bit deeper I see that now the 'ctlr->queue' is empty but
> 'ctlr->busy' flag is set and this is causing the 'could not stop message
> queue' error.
>
> It seems that __spi_pump_messages() is getting called several times
> during boot when registering the spi-flash, then after the spi-flash has
> been registered, about a 1 sec later spi_pump_idle_teardown() is called
> (as expected), but exits because 'ctlr->running' is true. However,
and this contradicts each other!
If there is a call to message pump, then we should process a message
and the counters should increase.
If those counters do not increase then there is something strange.
If we are called without anything to do then the pump should trigger a
tear down and stop.
> spi_pump_idle_teardown() is never called again and when we suspend we
> are stuck in the busy/running state. In this case should something be
> scheduling spi_pump_idle_teardown() again? Although even if it does I
> don't see where the busy flag would be cleared in this path?
>
I am wondering where busy/running would be set in the first place
if there are no counters...
Is it possible that the specific flash is not using the “normal”
spi_pump_message, but spi_controller_mem_ops operations?
Maybe we are missing the teardown in that driver or something is
changing flags there.
grepping a bit:
I see spi_mem_access_start calling spi_flush_queue, which is calling
pump_message, which - if there is no transfer - will trigger a delayed
tear-down, while it maybe should not be doing that.
Maybe spi_mem_access_end needs a call to spi_flush_queue as well?
Unfortunately this is theoretical work and quite a lot of guesswork
without an actual device available for testing...
Thanks,
Martin
Powered by blists - more mailing lists