Message-Id: <521CC383-1118-480F-BC3B-B0E12F9F4CFC@goldelico.com>
Date: Fri, 3 Jan 2020 19:29:03 +0100
From: "H. Nikolaus Schaller" <hns@...delico.com>
To: Aaro Koskinen <aaro.koskinen@....fi>
Cc: Peter Ujfalusi <peter.ujfalusi@...com>,
Tony Lindgren <tony@...mide.com>,
Linux-OMAP <linux-omap@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Discussions about the Letux Kernel
<letux-kernel@...nphoenux.org>
Subject: Re: [BISECTED, REGRESSION] OMAP3 onenand/DMA broken
Hi Aaro,
> On 03.01.2020 at 18:23, Aaro Koskinen <aaro.koskinen@....fi> wrote:
>
> Hi,
>
> On Fri, Jan 03, 2020 at 09:46:58AM +0100, H. Nikolaus Schaller wrote:
>>> On 03.01.2020 at 09:17, Aaro Koskinen <aaro.koskinen@....fi> wrote:
>>> When booting v5.4 (or v5.5-rc4) on N900, the console gets flooded with:
>>>
>>> [ 8.335754] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [ 8.365753] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [ 8.395751] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [ 8.425750] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [ 8.455749] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [ 8.485748] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [ 8.515777] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [ 8.545776] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [ 8.575775] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>>
>>> making the system unusable.
>>
>> I can confirm that this issue exists but so far we failed to bisect
>> and make a proper report.
>>
>> Sometimes the system boots fine and sometimes it fails.

Well, we boot from µSD and the number of timeouts varies. So whether we
reach a login: prompt may be down to a race or to the driver load sequence.
But this is not the real bug.

>>
>> It happens on omap3-gta04a5one.dts only, but not with omap3-gta04a4.dts
>> (both dm3730 but different NAND).
>
> I tried three different boards (N810, N900 and N950) and it always
> fails reliably.

The big question is why the patch is harmful.

I tried to understand what the patch is doing (without any knowledge
about the DMA hardware or software architecture).

Basically it reorders error handling and some corner cases.
Maybe one of them, which occurs only with OneNAND, is now handled differently.

What jumped out at me is that before the patch omap_dma_chan_read(c, CCR)
is always called if (!c->paused && c->running), regardless of the cookie
status. Only then is DMA_COMPLETE returned, or ret if txstate == 0.

With the new code the check for DMA_COMPLETE comes first and
leads directly to a return, independently of txstate.

So if we have (!c->paused && c->running) and dma_cookie_status()
returns DMA_COMPLETE, there is no longer a call to omap_dma_chan_read().

Since I do not understand what omap_dma_chan_read() is doing,
or whether (!c->paused && c->running) is relevant here,
I cannot tell whether that is harmful.

But I can imagine that reading a register may have the side effect of
resetting some bit, as is the case with interrupt status registers.
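
To make the difference concrete, here is a minimal sketch of the two
control flows as I read them. It is a toy model and not the driver code:
struct toy_chan, toy_chan_read_ccr(), toy_tx_status_old() and
toy_tx_status_new() are made-up names, and the register read is only
counted. Whether the real read has any side effect is exactly the open
question.

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
enum toy_status { TOY_IN_PROGRESS, TOY_COMPLETE };

struct toy_chan {
	bool paused;
	bool running;
	bool cookie_complete;	/* stands in for what dma_cookie_status() reports */
	int ccr_reads;		/* how often the (pretend) CCR register was read */
};

/* Stand-in for omap_dma_chan_read(c, CCR); it only counts the access. */
static unsigned int toy_chan_read_ccr(struct toy_chan *c)
{
	c->ccr_reads++;
	return 0;		/* dummy value, not interpreted by the model */
}

static enum toy_status toy_cookie_status(const struct toy_chan *c)
{
	return c->cookie_complete ? TOY_COMPLETE : TOY_IN_PROGRESS;
}

/* Before the patch (as described above): CCR is read whenever the
 * channel is running and not paused, whatever the cookie status is. */
static enum toy_status toy_tx_status_old(struct toy_chan *c)
{
	enum toy_status ret = toy_cookie_status(c);

	if (!c->paused && c->running)
		(void)toy_chan_read_ccr(c);

	return ret;
}

/* After the patch (as described above): a completed cookie returns
 * immediately, so CCR is not read in that case. */
static enum toy_status toy_tx_status_new(struct toy_chan *c)
{
	enum toy_status ret = toy_cookie_status(c);

	if (ret == TOY_COMPLETE)
		return ret;

	if (!c->paused && c->running)
		(void)toy_chan_read_ccr(c);

	return ret;
}
```

For a channel with paused == false, running == true and a completed
cookie, the old flow counts one CCR read and the new flow counts none.
If that read matters to the hardware, this is where the behaviour would
diverge.
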
I hope that Peter or Tony can respond soon.
BR and thanks,
Nikolaus