Message-ID: <YftWeLCpAfN/JnFj@pendragon.ideasonboard.com>
Date: Thu, 3 Feb 2022 06:13:44 +0200
From: Laurent Pinchart <laurent.pinchart@...asonboard.com>
To: Neel Gandhi <neel.gandhi@...inx.com>
Cc: dan.j.williams@...el.com, vkoul@...nel.org,
michal.simek@...inx.com, dmaengine@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] dmaengine: xilinx: dpdma: Fix race condition in vsync IRQ
Hi Neel,
Thank you for the patch.
On Sat, Jan 22, 2022 at 05:44:07PM +0530, Neel Gandhi wrote:
> Protected race condition of virtual DMA channel from channel queue transfer
> via vchan spin lock from the caller of xilinx_dpdma_chan_queue_transfer
This should explain what the race condition is, as it's not immediately
apparent from the patch itself. You can write it as
    The vchan_next_desc() function, called from
    xilinx_dpdma_chan_queue_transfer(), must be called with
    virt_dma_chan.lock held. This isn't correctly handled in all code
    paths, resulting in a race condition between the
    .device_issue_pending() handler and the IRQ handler which causes
    DMA to randomly stop. Fix it by taking the lock around
    xilinx_dpdma_chan_queue_transfer() calls that are missing it.
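For context, the two paths involved look roughly like this (a simplified
sketch from memory, not verbatim driver code):

	/*
	 * .device_issue_pending() path: vchan.lock is held around the
	 * call to xilinx_dpdma_chan_queue_transfer().
	 */
	static void xilinx_dpdma_issue_pending(struct dma_chan *dchan)
	{
		struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
		unsigned long flags;

		spin_lock_irqsave(&chan->vchan.lock, flags);
		if (vchan_issue_pending(&chan->vchan))
			xilinx_dpdma_chan_queue_transfer(chan);
		spin_unlock_irqrestore(&chan->vchan.lock, flags);
	}

	/*
	 * vsync IRQ path: xilinx_dpdma_chan_queue_transfer() walks the
	 * vchan descriptor list through vchan_next_desc() with only
	 * chan->lock held, so the two paths can race.
	 */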
> Signed-off-by: Neel Gandhi <neel.gandhi@...inx.com>
> ---
> drivers/dma/xilinx/xilinx_dpdma.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/dma/xilinx/xilinx_dpdma.c b/drivers/dma/xilinx/xilinx_dpdma.c
> index b0f4948b00a5..7d77a8948e38 100644
> --- a/drivers/dma/xilinx/xilinx_dpdma.c
> +++ b/drivers/dma/xilinx/xilinx_dpdma.c
> @@ -1102,7 +1102,9 @@ static void xilinx_dpdma_chan_vsync_irq(struct xilinx_dpdma_chan *chan)
Adding some context:
if (chan->desc.active)
vchan_cookie_complete(&chan->desc.active->vdesc);
> chan->desc.active = pending;
> chan->desc.pending = NULL;
>
> + spin_lock(&chan->vchan.lock);
> xilinx_dpdma_chan_queue_transfer(chan);
> + spin_unlock(&chan->vchan.lock);
There seems to be one more race condition here. vchan_cookie_complete()
is documented as requiring vchan.lock to be held too. Moving the
spin_lock() call before the vchan_cookie_complete() call should be
enough to fix it.
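Something along these lines (untested, just to illustrate the
reordering):

	spin_lock(&chan->vchan.lock);

	if (chan->desc.active)
		vchan_cookie_complete(&chan->desc.active->vdesc);

	chan->desc.active = pending;
	chan->desc.pending = NULL;

	xilinx_dpdma_chan_queue_transfer(chan);

	spin_unlock(&chan->vchan.lock);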
It would probably be useful to revisit locking in this driver, but for
now, with the issues pointed out here fixed, this patch should be good
enough. Can you submit a v2?
>
> out:
> spin_unlock_irqrestore(&chan->lock, flags);
> @@ -1495,7 +1497,9 @@ static void xilinx_dpdma_chan_err_task(struct tasklet_struct *t)
> XILINX_DPDMA_EINTR_CHAN_ERR_MASK << chan->id);
>
> spin_lock_irqsave(&chan->lock, flags);
> + spin_lock(&chan->vchan.lock);
> xilinx_dpdma_chan_queue_transfer(chan);
> + spin_unlock(&chan->vchan.lock);
> spin_unlock_irqrestore(&chan->lock, flags);
> }
>
--
Regards,
Laurent Pinchart