linux-kernel - Re: PROBLEM: deadlock when imx-sdma logs "restart cyclic channel"

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <AM0PR08MB3089D5A1D1F017E700CC3DB580E9A@AM0PR08MB3089.eurprd08.prod.outlook.com>
Date:   Mon, 4 Sep 2023 10:08:53 +0000
From:   Tim van der Staaij | Zign <Tim.vanderstaaij@...ngroup.com>
To:     Fabio Estevam <festevam@...il.com>
CC:     Shawn Guo <shawnguo@...nel.org>,
        Sascha Hauer <s.hauer@...gutronix.de>,
        Vinod Koul <vkoul@...nel.org>,
        Pengutronix Kernel Team <kernel@...gutronix.de>,
        "dmaengine@...r.kernel.org" <dmaengine@...r.kernel.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: PROBLEM: deadlock when imx-sdma logs "restart cyclic channel"

Hi Fabio,

Fabio Estevam <festevam@...il.com> wrote:
>5.19 is not a supported kernel version. Please test it with the latest 6.5-rc1.

Apologies, I was supposed to be using the 5.15 LTS kernel but due to a misconfiguration it got mixed up with a 5.19 image.

Unfortunately the circumstances for this bug are difficult to reproduce and I cannot do so at the moment.

Please disregard the bug report for now, and let me restate the issue as a theoretical question about the locking behavior of the driver.

Consider the following functions in drivers/dma/imx-sdma.c in the latest master revision:

static irqreturn_t sdma_int_handler(int irq, void *dev_id)
{
	struct sdma_engine *sdma = dev_id;
	unsigned long stat;

	stat = readl_relaxed(sdma->regs + SDMA_H_INTR);
	writel_relaxed(stat, sdma->regs + SDMA_H_INTR);
	/* channel 0 is special and not handled here, see run_channel0() */
	stat &= ~1;

	while (stat) {
		int channel = fls(stat) - 1;
		struct sdma_channel *sdmac = &sdma->channel[channel];
		struct sdma_desc *desc;

		spin_lock(&sdmac->vc.lock);
		desc = sdmac->desc;
		if (desc) {
			if (sdmac->flags & IMX_DMA_SG_LOOP) {
				if (sdmac->peripheral_type != IMX_DMATYPE_HDMI)
					sdma_update_channel_loop(sdmac);
				else
					vchan_cyclic_callback(&desc->vd);
			} else {
				mxc_sdma_handle_channel_normal(sdmac);
				vchan_cookie_complete(&desc->vd);
				sdma_start_desc(sdmac);
			}
		}

		spin_unlock(&sdmac->vc.lock);
		__clear_bit(channel, &stat);
	}

	return IRQ_HANDLED;
}

static void sdma_update_channel_loop(struct sdma_channel *sdmac)
{
	struct sdma_buffer_descriptor *bd;
	int error = 0;
	enum dma_status	old_status = sdmac->status;

	/* redacted for conciseness */

	/*
	 * SDMA stops cyclic channel when DMA request triggers a channel and no SDMA
	 * owned buffer is available (i.e. BD_DONE was set too late).
	 */
	if (sdmac->desc && !is_sdma_channel_enabled(sdmac->sdma, sdmac->channel)) {
		dev_warn(sdmac->sdma->dev, "restart cyclic channel %d\n", sdmac->channel);
		sdma_enable_channel(sdmac->sdma, sdmac->channel);
	}
}

As I understand it, sdma_update_channel_loop potentially calls dev_warn while sdmac->vc.lock is still held by sdma_int_handler. Then, dev_warn will also try to claim this lock somewhere down the call stack to safely access the device details. Is this a recipe for deadlock as I suspect, or am I overlooking something?

Kind regards,
Tim