linux-kernel - Re: [PATCH] dmaengine: ti: k3-udma: Avoid false error msg on chan teardown

Message-ID: <4938f187-21f6-a97b-1a9d-e191353f1b5e@gmail.com>
Date:   Tue, 1 Mar 2022 21:31:13 +0200
From:   Péter Ujfalusi <peter.ujfalusi@...il.com>
To:     Vignesh Raghavendra <vigneshr@...com>,
        Vinod Koul <vkoul@...nel.org>
Cc:     dmaengine@...r.kernel.org, linux-kernel@...r.kernel.org,
        Linux ARM Mailing List <linux-arm-kernel@...ts.infradead.org>,
        Jayesh Choudhary <j-choudhary@...com>
Subject: Re: [PATCH] dmaengine: ti: k3-udma: Avoid false error msg on chan
 teardown

Hi Vignesh,

On 28/02/2022 13:18, Vignesh Raghavendra wrote:
> 
> 
> On 28/02/22 2:52 pm, Vignesh Raghavendra wrote:
>> Hi Peter,
>>
>> On 21/02/22 1:42 am, Péter Ujfalusi wrote:
>>> Hi Vignesh,
>>>
>>> On 15/02/2022 06:41, Vignesh Raghavendra wrote:
>>>> In cyclic mode, there is no additional descriptor pushed to collect
>>>> outstanding data on channel teardown. Therefore no need to wait for this
>>>> descriptor to come back.
>>>>
>>>> Without this terminating aplay cmd outputs false error msg like:
>>>> [  116.402800] ti-bcdma 485c0100.dma-controller: chan1 teardown timeout!
>>>
>>> are you sure it is aplay? It is MEM_TO_DEV, we only use the flush
>>> descriptor for DEV_TO_MEM. MEM_TO_DEV can 'disconnect' from the
>>> peripheral to flush out the FIFO.
>>>
>>
>> Yes, this is with aplay. You are right that MEM_TO_DEV should have
>> worked w/o this patch.
>>
>>
>>> I have not seen this on am654, j721e. I can not recall seeing this on
>>> the capture side either.
>>>
>>
>> I don't see it either
>>
>>> The cyclic TR should be able to drain the DEV_TO_MEM by itself and the
>>> TR should terminate.
>>>
>>
>> You are right. There seems to be some trouble with McASP + BCDMA on AM62
>> which needs more investigation. I see
>>
>>  RT c0000000 peer RT 90000000
>>  BCNT 5dc00, peer BCNT 46400

In case of MEM_TO_DEV stop we set the peer DMA (PDMA) to flush and set
the UDMA/BCDMA/PKTDMA to tdown.
If the flush is set for the PDMA, it will (should) disconnect its
trigger from the peripheral and 'free run' to flush all data.
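
To spell out which drain mechanism I would expect per case, here is a
small userspace model. It is not the k3-udma code; the enum and
function names are made up for illustration and only encode what is
said in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: a userspace model, not the k3-udma driver code. */
enum xfer_dir { DIR_MEM_TO_DEV, DIR_DEV_TO_MEM };

enum drain_method {
	DRAIN_PEER_FLUSH,	/* PDMA drops its trigger and free-runs */
	DRAIN_FLUSH_DESC,	/* extra descriptor collects outstanding data */
	DRAIN_CYCLIC_TR,	/* cyclic TR drains by itself and terminates */
};

/* Pick the mechanism expected to empty the channel on stop. */
static enum drain_method drain_method_for(enum xfer_dir dir, bool cyclic)
{
	if (dir == DIR_MEM_TO_DEV)
		return DRAIN_PEER_FLUSH;

	/* DEV_TO_MEM: no flush descriptor is pushed in cyclic mode. */
	return cyclic ? DRAIN_CYCLIC_TR : DRAIN_FLUSH_DESC;
}
```

aplay is cyclic MEM_TO_DEV, so it lands in the first branch and should
not depend on any flush descriptor coming back.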

Afaik the PDMA on am62 is the same as it is on j721e, no?

>> So there is some data stuck in pipe which prevents channel from
>> disabling and TDCM being signaled. My guess is McASP is no longer
>> requesting more data from PDMA. Any way to look at McASP FIFO state/ DMA
>> req enable state? Wondering what else can prevent draining of data.
>>
>> One difference is that AM62 has ti,tlv320aic3106 codec (codec is the
>> master) where J7 uses PCM.
>>
> 
> I see a couple of issues with DMA usage by McASP/sound:
> 
> McASP TX FIFO events are disabled first and then DMA channel is stopped.
> This does not work for K3 SoCs as some data remains stuck in DMA pipe
> and channel never goes to disable state.

You can not really stop the DMA first because the McASP would underflow
right away. The McASP FIFO should be in bypass mode for K3 devices, the
PDMA can handle the feeding of McASP just fine.

One of the reasons to have the FLUSH on the peer (PDMA) side is exactly
this: to be able to drain the MEM_TO_DEV DMA FIFO to /dev/null even if
the peripheral is long gone (disabled, even powered down).

> I see .stop_dma_first flag in snd_soc_dai_link to force DMA to be
> stopped first, but I am not quite familiar on where to set this flag?

The stop_dma_first might work, but it was actually added to support pxa
(I think?) where they have two separate DMAs on the two sides of an
external FIFO. The DMAengine side needs to be stopped first and then they
have open-coded DMA code that busy loops waiting for the other DMA (non
DMAengine) to drain out the data.

> Even so, snd_dmaengine_pcm_trigger() calls dmaengine_terminate_async()
> and does not call dmaengine_synchronize() before disabling McASP TX, so
> channel teardown would still be unsuccessful.

We can not do a dmaengine_synchronize() in pcm.trigger as it is in
atomic context and we can not sleep.

> Alternately, we could reduce dev_warn() in udma_synchronize() to
> dev_dbg() as channel is still recoverable via  udma_reset_chan() which
> is done immediately after.
> There is a further dev_warn() message to indicate if channel refused to
> stop even after a reset?

The problem is that it is also possible that after a forced shutdown the
channel is not going to work anymore (we have this issue with am65, if I
recall right).

I would consult with the hardware team to understand what is going on,
and make sure that the McASP AFIFO is disabled (evnums are 0).

If it is really something in the hardware that behaves differently in
am62 then add a quirk to handle it and implement a workaround.
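
As a rough sketch of what I mean by a quirk, again as a userspace model
and not driver code: the flag name, the table and the helpers are all
hypothetical, and whether am62 needs such a flag at all is exactly what
has to be confirmed first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical quirk flag: skip the teardown wait for cyclic channels. */
#define QUIRK_NO_CYCLIC_TDOWN_WAIT	(1u << 0)

struct soc_quirks {
	const char *soc;
	unsigned int quirks;
};

/* Assumption for illustration only: am62 is the one SoC with the quirk. */
static const struct soc_quirks soc_table[] = {
	{ "am654", 0 },
	{ "j721e", 0 },
	{ "am62",  QUIRK_NO_CYCLIC_TDOWN_WAIT },
};

static unsigned int quirks_for(const char *soc)
{
	size_t i;

	assert(soc != NULL);
	for (i = 0; i < sizeof(soc_table) / sizeof(soc_table[0]); i++)
		if (strcmp(soc_table[i].soc, soc) == 0)
			return soc_table[i].quirks;
	return 0;	/* unknown SoC: no quirks */
}

/* Wait for teardown completion unless the quirk exempts cyclic channels. */
static bool must_wait_for_teardown(bool terminating, bool cyclic,
				   unsigned int quirks)
{
	if (!terminating)
		return false;
	if (cyclic && (quirks & QUIRK_NO_CYCLIC_TDOWN_WAIT))
		return false;
	return true;
}
```

This keeps the teardown timeout warning on am654 and j721e, where a
timeout would point to a real problem, and limits the change to the SoC
that is known to behave differently.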

> 
> Regards
> Vignesh
> 
>> Regards
>> Vignesh
>>
>>
>>>
>>>> Signed-off-by: Vignesh Raghavendra <vigneshr@...com>
>>>> ---
>>>>  drivers/dma/ti/k3-udma.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
>>>> index 9abb08d353ca0..c9a1b2f312603 100644
>>>> --- a/drivers/dma/ti/k3-udma.c
>>>> +++ b/drivers/dma/ti/k3-udma.c
>>>> @@ -3924,7 +3924,7 @@ static void udma_synchronize(struct dma_chan *chan)
>>>>  
>>>>  	vchan_synchronize(&uc->vc);
>>>>  
>>>> -	if (uc->state == UDMA_CHAN_IS_TERMINATING) {
>>>> +	if (uc->state == UDMA_CHAN_IS_TERMINATING && !uc->cyclic) {
>>>>  		timeout = wait_for_completion_timeout(&uc->teardown_completed,
>>>>  						      timeout);
>>>>  		if (!timeout) {
>>>

-- 
Péter