Message-ID: <5zz3ojgkk6uggx5moglrwn44g7vdnraa4skqkxt4k6pjvta4lh@oproxefudzrt>
Date: Fri, 25 Apr 2025 17:05:46 +0530
From: Jai Luthra <jai.luthra@...ux.dev>
To: Rishikesh Donadkar <r-donadkar@...com>
Cc: mripard@...nel.org, linux-kernel@...r.kernel.org,
linux-media@...r.kernel.org, devicetree@...r.kernel.org, devarsht@...com,
y-abhilashchandra@...com, mchehab@...nel.org, robh@...nel.org, krzk+dt@...nel.org,
conor+dt@...nel.org, vaishnav.a@...com, s-jain1@...com, vigneshr@...com,
sakari.ailus@...ux.intel.com, hverkuil-cisco@...all.nl, tomi.valkeinen@...asonboard.com,
jai.luthra@...asonboard.com, changhuang.liang@...rfivetech.com, jack.zhu@...rfivetech.com,
laurent.pinchart@...asonboard.com
Subject: Re: [PATCH v3 13/13] media: ti: j721e-csi2rx: Change the drain
architecture for multistream
Hi Rishikesh,
Thanks for the patch.
On Thu, Apr 17, 2025 at 12:25:54PM +0530, Rishikesh Donadkar wrote:
> In multistream use cases, we may face buffer starvation for various
> reasons, such as a slow downstream element in a gstreamer pipeline. In
> these scenarios we need to make sure that the data corresponding to the
> slow pipeline is pulled out of the shared HW FIFO in the CSI2RX IP, to
> prevent other streams from getting stalled due to FIFO overflow.
>
> Previously, in case of buffer starvation, the DMA was marked IDLE and the
> next buffer_queue() would drain the stale data before marking the new
> buffer ready for a DMA transaction. That is, the driver waits for the
> next VIDIOC_QBUF ioctl to drain the stale data from the HW FIFO; if that
> call is delayed, the HW FIFO will overflow, causing all the other camera
> pipelines in the media graph to hang.
The above description is a little hard to follow, and not quite correct. A
"late" QBUF call is what buffer starvation *is*. In single-stream scenarios
the older drain architecture worked fine, as the goal there was only to drain
stale data out of the FIFOs once buffers became available again.
>
> Introduce a new architecture where CSI data is always pulled out of the
> shared HW FIFO, irrespective of the availability of buffers from
> userspace.
I think this description does not make it very clear why multiple streams
(VCs) cause the older drain architecture to not work, and what exactly is the
limitation in the hardware FIFOs.
How about the following instead:
On buffer starvation the DMA is marked IDLE, and the stale data in the
internal FIFOs gets drained only on the next VIDIOC_QBUF call from
userspace. This approach works fine for the single-stream case.
But in multistream scenarios, buffer starvation for one stream, i.e. one
virtual channel, can block the shared HW FIFO of the CSI2RX IP. This can
stall the pipeline for all other virtual channels, even if buffers are
available for them.
Introduce a new architecture that continuously drains data from the
shared HW FIFO into a small (32KiB) scratch buffer whenever no buffers
are made available to the driver from userspace. This keeps the streams
independent of each other, so a slower downstream element for one camera
does not block streaming for the other cameras.
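To make the intent concrete for readers of the log, the recycling logic is
roughly the following (a simplified sketch of the completion path from the
diff below, reusing the driver's names; an outline, not the literal patch):

	/*
	 * Sketch of the recycled drain completion path: on every drain
	 * completion, hand the DMA back to user-facing buffers if any
	 * were queued in the meantime, otherwise submit another
	 * transaction into the 32KiB scratch buffer so the shared HW
	 * FIFO never backs up.
	 */
	static void drain_callback_sketch(void *param)
	{
		struct ti_csi2rx_ctx *ctx = param;
		struct ti_csi2rx_dma *dma = &ctx->dma;
		unsigned long flags;

		spin_lock_irqsave(&dma->lock, flags);
		if (dma->state != TI_CSI2RX_DMA_STOPPED) {
			if (list_empty(&dma->queue))
				ti_csi2rx_drain_dma(ctx);	/* keep draining */
			else
				ti_csi2rx_dma_submit_pending(ctx); /* real buffers */
		}
		spin_unlock_irqrestore(&dma->lock, flags);
	}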
>
> Signed-off-by: Rishikesh Donadkar <r-donadkar@...com>
> ---
> .../platform/ti/j721e-csi2rx/j721e-csi2rx.c | 96 +++++++------------
> 1 file changed, 33 insertions(+), 63 deletions(-)
>
> diff --git a/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c b/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c
> index 23d63d8bcd36a..7f476c78c4a92 100644
> --- a/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c
> +++ b/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c
> @@ -57,7 +57,6 @@
> #define TI_CSI2RX_MAX_SOURCE_PADS TI_CSI2RX_MAX_CTX
> #define TI_CSI2RX_MAX_PADS (1 + TI_CSI2RX_MAX_SOURCE_PADS)
>
> -#define DRAIN_TIMEOUT_MS 50
> #define DRAIN_BUFFER_SIZE SZ_32K
As you are draining in 32KiB chunks, it is likely that the hardware is in the
middle of a frame when userspace queues a new buffer, you stop draining, and
the target address changes to a user-facing buffer. How does the PSIL/DMA
engine handle the SoF/EoF boundaries in such a case?
Would it be possible for you to verify whether the first user-facing buffer
after draining has valid data? If it does not, please highlight that so it
may get fixed in the future.
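One quick way to check (a throwaway debug aid, purely a sketch on my side
and not something for the final patch) would be to flag the first
user-facing completion after a drain so that frame can be dumped and
inspected:

	/*
	 * Hypothetical debug aid, not part of this patch: have the
	 * drain callback set a per-context flag, then warn on the first
	 * user-facing buffer that completes after it, so that frame can
	 * be dumped and checked for valid SoF/EoF framing.
	 */
	static void debug_first_frame_after_drain(struct ti_csi2rx_ctx *ctx,
						  struct vb2_buffer *vb,
						  bool *drained)
	{
		if (*drained) {
			dev_warn(ctx->csi->dev,
				 "first frame after drain: buf %u, verify contents\n",
				 vb->index);
			*drained = false;
		}
	}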
>
> struct ti_csi2rx_fmt {
> @@ -77,7 +76,6 @@ struct ti_csi2rx_buffer {
>
> enum ti_csi2rx_dma_state {
> TI_CSI2RX_DMA_STOPPED, /* Streaming not started yet. */
> - TI_CSI2RX_DMA_IDLE, /* Streaming but no pending DMA operation. */
> TI_CSI2RX_DMA_ACTIVE, /* Streaming and pending DMA operation. */
> };
>
> @@ -238,6 +236,10 @@ static const struct ti_csi2rx_fmt ti_csi2rx_formats[] = {
> static int ti_csi2rx_start_dma(struct ti_csi2rx_ctx *ctx,
> struct ti_csi2rx_buffer *buf);
>
> +/* Forward declarations needed by ti_csi2rx_drain_callback. */
> +static int ti_csi2rx_drain_dma(struct ti_csi2rx_ctx *ctx);
> +static int ti_csi2rx_dma_submit_pending(struct ti_csi2rx_ctx *ctx);
> +
> static const struct ti_csi2rx_fmt *find_format_by_fourcc(u32 pixelformat)
> {
> unsigned int i;
> @@ -589,9 +591,28 @@ static void ti_csi2rx_setup_shim(struct ti_csi2rx_ctx *ctx)
>
> static void ti_csi2rx_drain_callback(void *param)
> {
> - struct completion *drain_complete = param;
> + struct ti_csi2rx_ctx *ctx = param;
> + struct ti_csi2rx_dma *dma = &ctx->dma;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&dma->lock, flags);
>
> - complete(drain_complete);
> + if (dma->state == TI_CSI2RX_DMA_STOPPED) {
> + spin_unlock_irqrestore(&dma->lock, flags);
> + return;
> + }
> +
> +	/*
> +	 * If dma->queue is empty, no buffer has arrived from userspace,
> +	 * so queue another transaction to keep draining the DMA.
> +	 */
> + if (list_empty(&dma->queue)) {
> + if (ti_csi2rx_drain_dma(ctx))
> + dev_warn(ctx->csi->dev, "DMA drain failed\n");
> + } else {
> + ti_csi2rx_dma_submit_pending(ctx);
> + }
> + spin_unlock_irqrestore(&dma->lock, flags);
> }
>
> /*
> @@ -609,12 +630,9 @@ static int ti_csi2rx_drain_dma(struct ti_csi2rx_ctx *ctx)
> {
> struct ti_csi2rx_dev *csi = ctx->csi;
> struct dma_async_tx_descriptor *desc;
> - struct completion drain_complete;
> dma_cookie_t cookie;
> int ret;
>
> - init_completion(&drain_complete);
> -
> desc = dmaengine_prep_slave_single(ctx->dma.chan, csi->drain.paddr,
> csi->drain.len, DMA_DEV_TO_MEM,
> DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> @@ -624,7 +642,7 @@ static int ti_csi2rx_drain_dma(struct ti_csi2rx_ctx *ctx)
> }
>
> desc->callback = ti_csi2rx_drain_callback;
> - desc->callback_param = &drain_complete;
> + desc->callback_param = ctx;
>
> cookie = dmaengine_submit(desc);
> ret = dma_submit_error(cookie);
> @@ -633,13 +651,6 @@ static int ti_csi2rx_drain_dma(struct ti_csi2rx_ctx *ctx)
>
> dma_async_issue_pending(ctx->dma.chan);
>
> - if (!wait_for_completion_timeout(&drain_complete,
> - msecs_to_jiffies(DRAIN_TIMEOUT_MS))) {
> - dmaengine_terminate_sync(ctx->dma.chan);
> - dev_dbg(csi->dev, "DMA transfer timed out for drain buffer\n");
> - ret = -ETIMEDOUT;
> - goto out;
> - }
> out:
> return ret;
> }
> @@ -687,9 +698,11 @@ static void ti_csi2rx_dma_callback(void *param)
>
> ti_csi2rx_dma_submit_pending(ctx);
>
> - if (list_empty(&dma->submitted))
> - dma->state = TI_CSI2RX_DMA_IDLE;
> -
> + if (list_empty(&dma->submitted)) {
> + if (ti_csi2rx_drain_dma(ctx))
> + dev_warn(ctx->csi->dev,
> + "DMA drain failed on one of the transactions\n");
Checkpatch warning here:
CHECK: Alignment should match open parenthesis
#138: FILE: drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c:704:
+ dev_warn(ctx->csi->dev,
+ "DMA drain failed on one of the transactions\n");
total: 0 errors, 0 warnings, 1 checks, 167 lines checked
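Aligning the continuation line with the open parenthesis fixes it:

	if (ti_csi2rx_drain_dma(ctx))
		dev_warn(ctx->csi->dev,
			 "DMA drain failed on one of the transactions\n");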
> + }
> spin_unlock_irqrestore(&dma->lock, flags);
> }
>
> @@ -742,7 +755,7 @@ static void ti_csi2rx_stop_dma(struct ti_csi2rx_ctx *ctx)
> * enforced before terminating DMA.
> */
> ret = ti_csi2rx_drain_dma(ctx);
> - if (ret && ret != -ETIMEDOUT)
> + if (ret)
> dev_warn(ctx->csi->dev,
> "Failed to drain DMA. Next frame might be bogus\n");
> }
> @@ -809,57 +822,14 @@ static void ti_csi2rx_buffer_queue(struct vb2_buffer *vb)
> struct ti_csi2rx_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> struct ti_csi2rx_buffer *buf;
> struct ti_csi2rx_dma *dma = &ctx->dma;
> - bool restart_dma = false;
> unsigned long flags = 0;
> - int ret;
>
> buf = container_of(vb, struct ti_csi2rx_buffer, vb.vb2_buf);
> buf->ctx = ctx;
>
> spin_lock_irqsave(&dma->lock, flags);
> - /*
> - * Usually the DMA callback takes care of queueing the pending buffers.
> - * But if DMA has stalled due to lack of buffers, restart it now.
> - */
> - if (dma->state == TI_CSI2RX_DMA_IDLE) {
> - /*
> - * Do not restart DMA with the lock held because
> - * ti_csi2rx_drain_dma() might block for completion.
> - * There won't be a race on queueing DMA anyway since the
> - * callback is not being fired.
> - */
> - restart_dma = true;
> - dma->state = TI_CSI2RX_DMA_ACTIVE;
> - } else {
> - list_add_tail(&buf->list, &dma->queue);
> - }
> + list_add_tail(&buf->list, &dma->queue);
> spin_unlock_irqrestore(&dma->lock, flags);
> -
> - if (restart_dma) {
> - /*
> - * Once frames start dropping, some data gets stuck in the DMA
> - * pipeline somewhere. So the first DMA transfer after frame
> - * drops gives a partial frame. This is obviously not useful to
> - * the application and will only confuse it. Issue a DMA
> - * transaction to drain that up.
> - */
> - ret = ti_csi2rx_drain_dma(ctx);
> - if (ret && ret != -ETIMEDOUT)
> - dev_warn(ctx->csi->dev,
> - "Failed to drain DMA. Next frame might be bogus\n");
> -
> - spin_lock_irqsave(&dma->lock, flags);
> - ret = ti_csi2rx_start_dma(ctx, buf);
> - if (ret) {
> - vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
> - dma->state = TI_CSI2RX_DMA_IDLE;
> - spin_unlock_irqrestore(&dma->lock, flags);
> - dev_err(ctx->csi->dev, "Failed to start DMA: %d\n", ret);
> - } else {
> - list_add_tail(&buf->list, &dma->submitted);
> - spin_unlock_irqrestore(&dma->lock, flags);
> - }
> - }
> }
>
> static int ti_csi2rx_get_vc(struct ti_csi2rx_ctx *ctx)
> --
> 2.34.1
>
Thanks,
Jai