[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <n6grskuq722vnogwp5obiwzv4pxs5bbqddadesffezhvba5cjh@d6shcrvpxujg>
Date: Fri, 23 Aug 2024 12:48:50 +0300
From: Serge Semin <fancer.lancer@...il.com>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jiri Slaby <jirislaby@...nel.org>, Andy Shevchenko <andy@...nel.org>,
Viresh Kumar <vireshk@...nel.org>, Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Vinod Koul <vkoul@...nel.org>
Cc: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
dmaengine@...r.kernel.org, linux-serial@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] dmaengine: dw: Prevent tx-status calling desc
callback (Fix UART deadlock!)
Hi folks
Any comments or suggestion about the change? The kernel occasionally
_deadlocks_ without it for the DW UART + DW DMAC hardware setup.
-Serge(y)
On Fri, Aug 02, 2024 at 11:07:51AM +0300, Serge Semin wrote:
> The dmaengine_tx_status() method updating the DMA-descriptors state and
> eventually calling the Tx-descriptors callback may potentially cause
> problems. In particular the deadlock was discovered in DW UART 8250 device
> interacting with DW DMA controller channels. The call-trace causing the
> deadlock is:
>
> serial8250_handle_irq()
> uart_port_lock_irqsave(port); ----------------------+
> handle_rx_dma() |
> serial8250_rx_dma_flush() |
> __dma_rx_complete() |
> dmaengine_tx_status() |
> dwc_scan_descriptors() |
> dwc_complete_all() |
> dwc_descriptor_complete() |
> dmaengine_desc_callback_invoke() |
> cb->callback(cb->callback_param); |
> || |
> dma_rx_complete(); |
> uart_port_lock_irqsave(port); ----+ <- Deadlock!
>
> So in case if the DMA-engine finished working at some point before the
> serial8250_rx_dma_flush() invocation and the respective tasklet hasn't
> been executed yet, then calling the dmaengine_tx_status() will cause the
> DMA-descriptors status update and the Tx-descriptor callback invocation.
> Generalizing the case up: if the dmaengine_tx_status() method callee and
> the Tx-descriptor callback refer to the related critical section, then
> calling dmaengine_tx_status() from the Tx-descriptor callback will
> inevitably cause a deadlock around the guarding lock as it happens in the
> Serial 8250 DMA implementation above. (Note the deadlock doesn't happen
> very often, but can be eventually discovered if the being received data
> size is greater than the Rx DMA-buffer size defined in the 8250_dma.c
> driver. In my case reducing the Rx DMA-buffer size increased the deadlock
> probability.)
>
> The easiest way to fix the deadlock was to just remove the Tx-descriptors
> state update from the DW DMA-engine Tx-descriptor status method
> implementation, as the most of the DMA-engine drivers imply. After this
> fix is applied the Tx-descriptors status will be only updated in the
> framework of the dwc_scan_descriptors() method called from the tasklet
> handling the deferred DMA-controller IRQ.
>
> Signed-off-by: Serge Semin <fancer.lancer@...il.com
>
> ---
>
> Note I have doubts whether it's the best possible solution of the problem
> since the client-driver deadlock is resolved here by fixing the DMA-engine
> provider code. But I failed to find any reasonable solution in the 8250
> DMA implementation.
>
> Moreover the suggested fix cause a weird outcome - under the high-speed
> and heavy serial transfers the next error is printed to the log sometimes:
>
> > dma dma0chan0: BUG: XFER bit set, but channel not idle!
>
> That's why the patch submitted as RFC. Do you have any better idea in mind
> to prevent the nested lock?
>
> Cc: Viresh Kumar <vireshk@...nel.org>
> Cc: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
> CC: Andy Shevchenko <andy@...nel.org>
> Cc: Vinod Koul <vkoul@...nel.org>
> Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> Cc: Jiri Slaby <jirislaby@...nel.org>
> Cc: "Ilpo Järvinen" <ilpo.jarvinen@...ux.intel.com>
> Cc: dmaengine@...r.kernel.org
> Cc: linux-serial@...r.kernel.org
> Cc: linux-kernel@...r.kernel.org
> ---
> drivers/dma/dw/core.c | 6 ------
> 1 file changed, 6 deletions(-)
>
> diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
> index 5f7d690e3dba..4b3402156eae 100644
> --- a/drivers/dma/dw/core.c
> +++ b/drivers/dma/dw/core.c
> @@ -925,12 +925,6 @@ dwc_tx_status(struct dma_chan *chan,
> struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
> enum dma_status ret;
>
> - ret = dma_cookie_status(chan, cookie, txstate);
> - if (ret == DMA_COMPLETE)
> - return ret;
> -
> - dwc_scan_descriptors(to_dw_dma(chan->device), dwc);
> -
> ret = dma_cookie_status(chan, cookie, txstate);
> if (ret == DMA_COMPLETE)
> return ret;
> --
> 2.43.0
>
Powered by blists - more mailing lists