Message-ID: <YftWeLCpAfN/JnFj@pendragon.ideasonboard.com>
Date: Thu, 3 Feb 2022 06:13:44 +0200
From: Laurent Pinchart <laurent.pinchart@...asonboard.com>
To: Neel Gandhi <neel.gandhi@...inx.com>
Cc: dan.j.williams@...el.com, vkoul@...nel.org,
michal.simek@...inx.com, dmaengine@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] dmaengine: xilinx: dpdma: Fix race condition in vsync IRQ
Hi Neel,
Thank you for the patch.
On Sat, Jan 22, 2022 at 05:44:07PM +0530, Neel Gandhi wrote:
> Protected race condition of virtual DMA channel from channel queue transfer
> via vchan spin lock from the caller of xilinx_dpdma_chan_queue_transfer
This should explain what the race condition is, as it's not immediately
apparent from the patch itself. You can write it as
    The vchan_next_desc() function, called from
    xilinx_dpdma_chan_queue_transfer(), must be called with
    virt_dma_chan.lock held. This isn't correctly handled in all code
    paths, resulting in a race condition between the
    .device_issue_pending() handler and the IRQ handler which causes
    DMA to randomly stop. Fix it by taking the lock around
    xilinx_dpdma_chan_queue_transfer() calls that are missing it.
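For context, the two paths involved look roughly like this (a simplified
sketch from memory, not verbatim driver code):

	/*
	 * .device_issue_pending() path: vchan.lock is held around the
	 * call to xilinx_dpdma_chan_queue_transfer().
	 */
	static void xilinx_dpdma_issue_pending(struct dma_chan *dchan)
	{
		struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
		unsigned long flags;

		spin_lock_irqsave(&chan->vchan.lock, flags);
		if (vchan_issue_pending(&chan->vchan))
			xilinx_dpdma_chan_queue_transfer(chan);
		spin_unlock_irqrestore(&chan->vchan.lock, flags);
	}

	/*
	 * vsync IRQ path: xilinx_dpdma_chan_queue_transfer() walks the
	 * vchan descriptor list through vchan_next_desc() with only
	 * chan->lock held, so the two paths can race.
	 */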
> Signed-off-by: Neel Gandhi <neel.gandhi@...inx.com>
> ---
> drivers/dma/xilinx/xilinx_dpdma.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/dma/xilinx/xilinx_dpdma.c b/drivers/dma/xilinx/xilinx_dpdma.c
> index b0f4948b00a5..7d77a8948e38 100644
> --- a/drivers/dma/xilinx/xilinx_dpdma.c
> +++ b/drivers/dma/xilinx/xilinx_dpdma.c
> @@ -1102,7 +1102,9 @@ static void xilinx_dpdma_chan_vsync_irq(struct xilinx_dpdma_chan *chan)
Adding some context:
if (chan->desc.active)
vchan_cookie_complete(&chan->desc.active->vdesc);
> chan->desc.active = pending;
> chan->desc.pending = NULL;
>
> + spin_lock(&chan->vchan.lock);
> xilinx_dpdma_chan_queue_transfer(chan);
> + spin_unlock(&chan->vchan.lock);
There seems to be one more race condition here. vchan_cookie_complete()
is documented as requiring vchan.lock to be held too. Moving the
spin_lock() call before the vchan_cookie_complete() call should be
enough to fix it.
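Something along these lines (untested, just to illustrate the
reordering):

	spin_lock(&chan->vchan.lock);

	if (chan->desc.active)
		vchan_cookie_complete(&chan->desc.active->vdesc);

	chan->desc.active = pending;
	chan->desc.pending = NULL;

	xilinx_dpdma_chan_queue_transfer(chan);

	spin_unlock(&chan->vchan.lock);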
It would probably be useful to revisit locking in this driver, but for
now, with the issues pointed out here fixed, this patch should be good
enough. Can you submit a v2?
>
> out:
> spin_unlock_irqrestore(&chan->lock, flags);
> @@ -1495,7 +1497,9 @@ static void xilinx_dpdma_chan_err_task(struct tasklet_struct *t)
> XILINX_DPDMA_EINTR_CHAN_ERR_MASK << chan->id);
>
> spin_lock_irqsave(&chan->lock, flags);
> + spin_lock(&chan->vchan.lock);
> xilinx_dpdma_chan_queue_transfer(chan);
> + spin_unlock(&chan->vchan.lock);
> spin_unlock_irqrestore(&chan->lock, flags);
> }
>
--
Regards,
Laurent Pinchart