[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a23922e98926f14158d1cb8525026d7cdfea7b52.camel@collabora.com>
Date: Mon, 15 Sep 2025 16:14:18 -0400
From: Nicolas Dufresne <nicolas.dufresne@...labora.com>
To: "jackson.lee" <jackson.lee@...psnmedia.com>, "mchehab@...nel.org"
<mchehab@...nel.org>, "hverkuil-cisco@...all.nl"
<hverkuil-cisco@...all.nl>, "bob.beckett@...labora.com"
<bob.beckett@...labora.com>
Cc: "linux-media@...r.kernel.org" <linux-media@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"lafley.kim" <lafley.kim@...psnmedia.com>, "b-brnich@...com"
<b-brnich@...com>, "hverkuil@...all.nl" <hverkuil@...all.nl>, Nas Chung
<nas.chung@...psnmedia.com>
Subject: Re: [PATCH v3 0/4] Performance improvement of decoder
Le lundi 15 septembre 2025 à 07:33 +0000, jackson.lee a écrit :
> Hi Nicolas
>
> > Good catch, unfortunately it does not completely fix the problem for me.
> > You can find a the end of this message the patch I actually tested. Note
> > I ,ove the mutex_ret in a close scope, and fixed other occurence of this
> > pattern, except one that I highlighted to you with a FIXME.
> >
> > Some new information, I had this trace from GStreamer when the bug occured
> > on forward seeks (very rare):
> >
> > ** (gst-play-1.0:604): WARNING **: 00:03:59.965: v4l2h264dec0: Too old
> > frames, bug in decoder -- please file a bug
> >
> > [root@...into nicolas]# echo w > /proc/sysrq-trigger [ 335.116289] sysrq:
> > Show Blocked State
> > [ 335.120054] task:typefind:sink state:D stack:0 pid:607 tgid:604
> > ppid:543 task_flags:0x40044c flags:0x00000019
> > [ 335.131147] Call trace:
> > [ 335.133584] __switch_to+0xf0/0x1c0 (T) [ 335.137442]
> > __schedule+0x35c/0x9bc [ 335.140935] schedule+0x34/0x110 [ 335.144162]
> > schedule_timeout+0x80/0x104 [ 335.148081]
> > wait_for_completion_timeout+0x74/0x158
> > [ 335.152955] wave5_vpu_wait_interrupt+0x28/0x60 [wave5] [ 335.158252]
> > wave5_vpu_dec_stop_streaming+0x68/0x28c [wave5] [ 335.163915]
> > __vb2_queue_cancel+0x2c/0x2d4 [videobuf2_common] [ 335.169668]
> > vb2_core_queue_release+0x20/0x74 [videobuf2_common] [ 335.175678]
> > vb2_queue_release+0x10/0x1c [videobuf2_v4l2] [ 335.181081]
> > v4l2_m2m_ctx_release+0x20/0x40 [v4l2_mem2mem] [ 335.186567]
> > wave5_vpu_release_device+0x44/0x150 [wave5] [ 335.191879]
> > wave5_vpu_dec_release+0x20/0x2c [wave5] [ 335.196841]
> > v4l2_release+0xb4/0xf0 [videodev] [ 335.201709] __fput+0xd0/0x2e0
> > [ 335.205090] ____fput+0x14/0x20 [ 335.208468] task_work_run+0x64/0xd4
> > [ 335.212164] do_exit+0x240/0x8e0 [ 335.215552] do_group_exit+0x30/0xa4
> > [ 335.219177] get_signal+0x790/0x860 [ 335.222676] do_signal+0x94/0x394
> > [ 335.225986] do_notify_resume+0xd0/0x14c [ 335.229910]
> > el0_svc+0xe4/0xe8 [ 335.232967] el0t_64_sync_handler+0xa0/0xe4
> > [ 335.237154] el0t_64_sync+0x198/0x19c
> >
> > regards,
> > Nicolas
> >
>
>
> There were two problems.
>
> One is a logic double linked list managing, it caused crash.
> I added linked list(avail_src_bufs) to manage buffer list queued in order to improve performance of decoder.
> But when it was deleted from list, the prev and next pointer had the previous pointer address, so sometimes it caused crash and deadlock.
>
> Second, when stop_call function was called, sometimes infinite loop was put. But it is needed since flush was added into streamoff_output.
>
> To fix the above problems (deadlock and crash), the below patch is needed.
>
>
>
> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
> index 4554a24df8a1..9572573ce606 100644
> --- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
> @@ -1139,6 +1139,24 @@ static int write_to_ringbuffer(struct vpu_instance *inst, void *buffer, size_t b
> return 0;
> }
>
> +static struct vpu_src_buffer *inst_buf_remove(struct vpu_instance *inst)
> +{
> + struct vpu_src_buffer *b;
> +
> + if (list_empty(&inst->avail_src_bufs))
> + return NULL;
> + inst->queued_count--;
> + b = list_first_entry(&inst->avail_src_bufs, struct vpu_src_buffer, list);
> + list_del(&b->list);
> + b->list.prev = b->list.next = NULL;
> + INIT_LIST_HEAD(&b->list);
> + if (inst->queued_count == 0) {
> + inst->avail_src_bufs.prev = inst->avail_src_bufs.next = NULL;
> + INIT_LIST_HEAD(&inst->avail_src_bufs);
> + }
> + return b;
> +}
> +
> static int fill_ringbuffer(struct vpu_instance *inst)
> {
> struct v4l2_m2m_ctx *m2m_ctx = inst->v4l2_fh.m2m_ctx;
> @@ -1154,7 +1172,7 @@ static int fill_ringbuffer(struct vpu_instance *inst)
> }
> }
>
> - list_for_each_entry(vpu_buf, &inst->avail_src_bufs, list) {
> + while((vpu_buf = inst_buf_remove(inst)) != NULL) {
Can't you make sure of list_for_each_safe() ?
Nicolas
> struct vb2_v4l2_buffer *vbuf = &vpu_buf->v4l2_m2m_buf.vb;
> struct vpu_buf *ring_buffer = &inst->bitstream_vbuf;
> size_t src_size = vb2_get_plane_payload(&vbuf->vb2_buf, 0);
> @@ -1217,7 +1235,6 @@ static int fill_ringbuffer(struct vpu_instance *inst)
> }
>
> inst->queuing_num++;
> - list_del_init(&vpu_buf->list);
> break;
> }
>
> @@ -1233,13 +1250,9 @@ static void wave5_vpu_dec_buf_queue_src(struct vb2_buffer *vb)
>
> vpu_buf->consumed = false;
> vbuf->sequence = inst->queued_src_buf_num++;
> -
> - v4l2_m2m_buf_queue(m2m_ctx, vbuf);
> -
> - INIT_LIST_HEAD(&vpu_buf->list);
> - mutex_lock(&inst->feed_lock);
> list_add_tail(&vpu_buf->list, &inst->avail_src_bufs);
> - mutex_unlock(&inst->feed_lock);
> + inst->queued_count++;
> + v4l2_m2m_buf_queue(m2m_ctx, vbuf);
> }
>
> static void wave5_vpu_dec_buf_queue_dst(struct vb2_buffer *vb)
> @@ -1292,9 +1305,12 @@ static void wave5_vpu_dec_buf_queue(struct vb2_buffer *vb)
> vb2_plane_size(&vbuf->vb2_buf, 1), vb2_plane_size(&vbuf->vb2_buf, 2));
>
> if (vb->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
> + mutex_lock(&inst->feed_lock);
> + wave5_vpu_dec_buf_queue_src(vb);
> +
> if (inst->empty_queue)
> inst->empty_queue = false;
> - wave5_vpu_dec_buf_queue_src(vb);
> + mutex_unlock(&inst->feed_lock);
> } else if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE) {
> wave5_vpu_dec_buf_queue_dst(vb);
> }
> @@ -1396,9 +1412,13 @@ static int streamoff_output(struct vb2_queue *q)
>
> inst->retry = false;
> inst->queuing_num = 0;
> -
> - list_for_each_entry_safe(vpu_buf, tmp, &inst->avail_src_bufs, list)
> - list_del_init(&vpu_buf->list);
> + inst->queued_count = 0;
> + mutex_lock(&inst->feed_lock);
> + list_for_each_entry_safe(vpu_buf, tmp, &inst->avail_src_bufs, list) {
> + vpu_buf->consumed = false;
> + list_del(&vpu_buf->list);
> + }
> + mutex_unlock(&inst->feed_lock);
>
> for (i = 0; i < v4l2_m2m_num_dst_bufs_ready(m2m_ctx); i++) {
> ret = wave5_vpu_dec_set_disp_flag(inst, i);
> @@ -1484,28 +1504,9 @@ static void wave5_vpu_dec_stop_streaming(struct vb2_queue *q)
> {
> struct vpu_instance *inst = vb2_get_drv_priv(q);
> struct v4l2_m2m_ctx *m2m_ctx = inst->v4l2_fh.m2m_ctx;
> - bool check_cmd = TRUE;
>
> dev_dbg(inst->dev->dev, "%s: type: %u\n", __func__, q->type);
> pm_runtime_resume_and_get(inst->dev->dev);
> - inst->empty_queue = false;
> - inst->sent_eos = false;
> -
> - while (check_cmd) {
> - struct queue_status_info q_status;
> - struct dec_output_info dec_output_info;
> -
> - wave5_vpu_dec_give_command(inst, DEC_GET_QUEUE_STATUS, &q_status);
> -
> - if ((inst->state == VPU_INST_STATE_STOP || q_status.instance_queue_count == 0) &&
> - q_status.report_queue_count == 0)
> - break;
> -
> - wave5_vpu_wait_interrupt(inst, VPU_DEC_STOP_TIMEOUT);
> -
> - if (wave5_vpu_dec_get_output_info(inst, &dec_output_info))
> - dev_dbg(inst->dev->dev, "there is no output info\n");
> - }
>
> v4l2_m2m_update_stop_streaming_state(m2m_ctx, q);
>
> @@ -1514,6 +1515,8 @@ static void wave5_vpu_dec_stop_streaming(struct vb2_queue *q)
> else
> streamoff_capture(q);
>
> + inst->empty_queue = false;
> + inst->sent_eos = false;
> pm_runtime_mark_last_busy(inst->dev->dev);
> pm_runtime_put_autosuspend(inst->dev->dev);
> }
> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu.h b/drivers/media/platform/chips-media/wave5/wave5-vpu.h
> index 3847332551fc..2e7a6b0bf74a 100644
> --- a/drivers/media/platform/chips-media/wave5/wave5-vpu.h
> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu.h
> @@ -22,8 +22,8 @@
>
> struct vpu_src_buffer {
> struct v4l2_m2m_buffer v4l2_m2m_buf;
> - struct list_head list;
> bool consumed;
> + struct list_head list;
> };
>
> struct vpu_dst_buffer {
>
> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
> index edbe69540ef1..fb41890fc9c1 100644
> --- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
> @@ -52,11 +52,12 @@ int wave5_vpu_init_with_bitcode(struct device *dev, u8 *bitcode, size_t size)
> int wave5_vpu_flush_instance(struct vpu_instance *inst)
> {
> int ret = 0;
> + int mutex_ret = 0;
> int retry = 0;
>
> - ret = mutex_lock_interruptible(&inst->dev->hw_lock);
> - if (ret)
> - return ret;
> + mutex_ret = mutex_lock_interruptible(&inst->dev->hw_lock);
> + if (mutex_ret)
> + return mutex_ret;
> do {
> /*
> * Repeat the FLUSH command until the firmware reports that the
> @@ -80,15 +81,16 @@ int wave5_vpu_flush_instance(struct vpu_instance *inst)
>
> mutex_unlock(&inst->dev->hw_lock);
> wave5_vpu_dec_get_output_info(inst, &dec_info);
> - ret = mutex_lock_interruptible(&inst->dev->hw_lock);
> - if (ret)
> - return ret;
> + mutex_ret = mutex_lock_interruptible(&inst->dev->hw_lock);
> + if (mutex_ret) {
> + printk("lock fail...... \n");
> + return mutex_ret;
> + }
> if (dec_info.index_frame_display > 0)
> wave5_vpu_dec_set_disp_flag(inst, dec_info.index_frame_display);
> }
> } while (ret != 0);
> mutex_unlock(&inst->dev->hw_lock);
> -
> return ret;
> }
>
> @@ -208,6 +210,7 @@ int wave5_vpu_dec_close(struct vpu_instance *inst, u32 *fail_res)
> struct vpu_device *vpu_dev = inst->dev;
> int i;
> struct dec_output_info dec_info;
> + int ret_mutex;
>
> *fail_res = 0;
> if (!inst->codec_info)
> @@ -215,10 +218,10 @@ int wave5_vpu_dec_close(struct vpu_instance *inst, u32 *fail_res)
>
> pm_runtime_resume_and_get(inst->dev->dev);
>
> - ret = mutex_lock_interruptible(&vpu_dev->hw_lock);
> - if (ret) {
> + ret_mutex = mutex_lock_interruptible(&vpu_dev->hw_lock);
> + if (ret_mutex) {
> pm_runtime_put_sync(inst->dev->dev);
> - return ret;
> + return ret_mutex;
> }
>
> do {
> @@ -243,10 +246,10 @@ int wave5_vpu_dec_close(struct vpu_instance *inst, u32 *fail_res)
>
> mutex_unlock(&vpu_dev->hw_lock);
> wave5_vpu_dec_get_output_info(inst, &dec_info);
> - ret = mutex_lock_interruptible(&vpu_dev->hw_lock);
> - if (ret) {
> + ret_mutex = mutex_lock_interruptible(&vpu_dev->hw_lock);
> + if (ret_mutex) {
> pm_runtime_put_sync(inst->dev->dev);
> - return ret;
> + return ret_mutex;
> }
> } while (ret != 0);
>
> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h
> index d9da80d64278..10f89f327436 100644
> --- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h
> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h
> @@ -822,6 +822,7 @@ struct vpu_instance {
> bool sent_eos; /* check if EOS is sent to application */
> bool retry; /* retry to feed bitstream if failure reason is WAVE5_SYSERR_QUEUEING_FAIL*/
> int queuing_num; /* count of bitstream queued */
> + int queued_count;
> struct mutex feed_lock; /* lock for feeding bitstream buffers */
> bool empty_queue;
> struct vpu_buf bitstream_vbuf;
Download attachment "signature.asc" of type "application/pgp-signature" (229 bytes)
Powered by blists - more mailing lists