Message-ID: <c14c5a8b309ffcea723cee66430a59ee57b73e5f.camel@collabora.com>
Date: Wed, 10 Sep 2025 08:56:46 -0400
From: Nicolas Dufresne <nicolas.dufresne@...labora.com>
To: "jackson.lee" <jackson.lee@...psnmedia.com>, "mchehab@...nel.org"	
 <mchehab@...nel.org>, "hverkuil-cisco@...all.nl"
 <hverkuil-cisco@...all.nl>,  "bob.beckett@...labora.com"	
 <bob.beckett@...labora.com>
Cc: "linux-media@...r.kernel.org" <linux-media@...r.kernel.org>, 
 "linux-kernel@...r.kernel.org"	 <linux-kernel@...r.kernel.org>,
 "lafley.kim" <lafley.kim@...psnmedia.com>,  "b-brnich@...com"	
 <b-brnich@...com>, "hverkuil@...all.nl" <hverkuil@...all.nl>, Nas Chung	
 <nas.chung@...psnmedia.com>
Subject: Re: [PATCH v3 0/4] Performance improvement of decoder

Hi Jackson,

Le mercredi 10 septembre 2025 à 06:59 +0000, jackson.lee a écrit :
[...]

> I have reproduced the stall problem; I can see it with the latest GStreamer version.
> The root cause is that we checked an incorrect return value while flushing, so the loop that checks whether flushing has finished exited even though flushing was not finished.
> When stop streaming was called and the instance queue count was 1, the checking function entered an infinite loop, so the stall happened.
> 
> The patch below is needed.
> 
> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
> index edbe69540ef1..2e0128cd0e4d 100644
> --- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
> @@ -52,6 +52,7 @@ int wave5_vpu_init_with_bitcode(struct device *dev, u8 *bitcode, size_t size)
>  int wave5_vpu_flush_instance(struct vpu_instance *inst)
>  {
>         int ret = 0;
> +       int mutex_ret = 0;
>         int retry = 0;
> 
>         ret = mutex_lock_interruptible(&inst->dev->hw_lock);
> @@ -80,9 +81,9 @@ int wave5_vpu_flush_instance(struct vpu_instance *inst)
> 
>                         mutex_unlock(&inst->dev->hw_lock);
>                         wave5_vpu_dec_get_output_info(inst, &dec_info);
> -                       ret = mutex_lock_interruptible(&inst->dev->hw_lock);
> -                       if (ret)
> -                               return ret;
> +                       mutex_ret = mutex_lock_interruptible(&inst->dev->hw_lock);
> +                       if (mutex_ret)
> +                               return mutex_ret;
>                         if (dec_info.index_frame_display > 0)
>                                 wave5_vpu_dec_set_disp_flag(inst, dec_info.index_frame_display);
>                 }

Good catch, unfortunately it does not completely fix the problem for me. You can
find at the end of this message the patch I actually tested. Note that I moved
mutex_ret into a closer scope and fixed the other occurrences of this pattern,
except one that I highlighted to you with a FIXME.
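
To illustrate the pattern being fixed, here is a minimal user-space sketch, not the driver code. It assumes a retry loop of the same shape as the one in the diff context (`do { ... } while (ret != 0)`). The names `fake_vpu`, `fake_hw_flush()`, `fake_lock()`, `flush_clobbered()` and `flush_scoped()` are hypothetical stand-ins: the first function reports -EBUSY a fixed number of times, the second plays the role of mutex_lock_interruptible() and always succeeds.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the VPU: still busy for `busy_left` more calls. */
struct fake_vpu {
	int busy_left;
};

/* Stand-in for the hardware flush: -EBUSY while busy, 0 once finished. */
static int fake_hw_flush(struct fake_vpu *vpu)
{
	if (vpu->busy_left > 0) {
		vpu->busy_left--;
		return -EBUSY;
	}
	return 0;
}

/* Stand-in for mutex_lock_interruptible(); always succeeds in this sketch. */
static int fake_lock(void)
{
	return 0;
}

/* Old shape: the lock result overwrites the flush status. */
static int flush_clobbered(struct fake_vpu *vpu)
{
	int ret;

	do {
		ret = fake_hw_flush(vpu);
		if (ret == -EBUSY) {
			ret = fake_lock();	/* -EBUSY is replaced by 0 here */
			if (ret)
				return ret;
		}
	} while (ret != 0);	/* exits although the flush is not finished */

	return ret;
}

/* New shape: the lock result lives in its own, narrowly scoped variable. */
static int flush_scoped(struct fake_vpu *vpu)
{
	int ret;

	do {
		ret = fake_hw_flush(vpu);
		if (ret == -EBUSY) {
			int ret_mutex = fake_lock();

			if (ret_mutex)
				return ret_mutex;
		}
	} while (ret != 0);	/* still sees -EBUSY and retries */

	return ret;
}

/* Both variants report success, but only one has really drained the VPU. */
static void check_flush_variants(void)
{
	struct fake_vpu a = { .busy_left = 3 };
	struct fake_vpu b = { .busy_left = 3 };

	assert(flush_clobbered(&a) == 0 && a.busy_left == 2);
	assert(flush_scoped(&b) == 0 && b.busy_left == 0);
}
```

In the sketch both variants return 0, which is why the early exit is easy to miss: the caller cannot tell that the clobbered loop stopped after a single attempt.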

Some new information: I got this trace from GStreamer when the bug occurred on forward seeks (very rare):

** (gst-play-1.0:604): WARNING **: 00:03:59.965: v4l2h264dec0: Too old frames, bug in decoder -- please file a bug

[root@...into nicolas]# echo w > /proc/sysrq-trigger 
[  335.116289] sysrq: Show Blocked State
[  335.120054] task:typefind:sink   state:D stack:0     pid:607   tgid:604   ppid:543    task_flags:0x40044c flags:0x00000019
[  335.131147] Call trace:
[  335.133584]  __switch_to+0xf0/0x1c0 (T)
[  335.137442]  __schedule+0x35c/0x9bc
[  335.140935]  schedule+0x34/0x110
[  335.144162]  schedule_timeout+0x80/0x104
[  335.148081]  wait_for_completion_timeout+0x74/0x158
[  335.152955]  wave5_vpu_wait_interrupt+0x28/0x60 [wave5]
[  335.158252]  wave5_vpu_dec_stop_streaming+0x68/0x28c [wave5]
[  335.163915]  __vb2_queue_cancel+0x2c/0x2d4 [videobuf2_common]
[  335.169668]  vb2_core_queue_release+0x20/0x74 [videobuf2_common]
[  335.175678]  vb2_queue_release+0x10/0x1c [videobuf2_v4l2]
[  335.181081]  v4l2_m2m_ctx_release+0x20/0x40 [v4l2_mem2mem]
[  335.186567]  wave5_vpu_release_device+0x44/0x150 [wave5]
[  335.191879]  wave5_vpu_dec_release+0x20/0x2c [wave5]
[  335.196841]  v4l2_release+0xb4/0xf0 [videodev]
[  335.201709]  __fput+0xd0/0x2e0
[  335.205090]  ____fput+0x14/0x20
[  335.208468]  task_work_run+0x64/0xd4
[  335.212164]  do_exit+0x240/0x8e0
[  335.215552]  do_group_exit+0x30/0xa4
[  335.219177]  get_signal+0x790/0x860
[  335.222676]  do_signal+0x94/0x394
[  335.225986]  do_notify_resume+0xd0/0x14c
[  335.229910]  el0_svc+0xe4/0xe8
[  335.232967]  el0t_64_sync_handler+0xa0/0xe4
[  335.237154]  el0t_64_sync+0x198/0x19c

regards,
Nicolas

---

diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
index edbe69540ef1e..2faca2eee41fe 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
@@ -77,12 +77,13 @@ int wave5_vpu_flush_instance(struct vpu_instance *inst)
                        return -ETIMEDOUT;
                } else if (ret == -EBUSY) {
                        struct dec_output_info dec_info;
+                       int ret_mutex;
 
                        mutex_unlock(&inst->dev->hw_lock);
                        wave5_vpu_dec_get_output_info(inst, &dec_info);
-                       ret = mutex_lock_interruptible(&inst->dev->hw_lock);
-                       if (ret)
-                               return ret;
+                       ret_mutex = mutex_lock_interruptible(&inst->dev->hw_lock);
+                       if (ret_mutex)
+                               return ret_mutex;
                        if (dec_info.index_frame_display > 0)
                                wave5_vpu_dec_set_disp_flag(inst, dec_info.index_frame_display);
                }
@@ -222,6 +223,8 @@ int wave5_vpu_dec_close(struct vpu_instance *inst, u32 *fail_res)
        }
 
        do {
+               int ret_mutex;
+
                ret = wave5_vpu_dec_finish_seq(inst, fail_res);
                if (ret < 0 && *fail_res != WAVE5_SYSERR_VPU_STILL_RUNNING) {
                        dev_warn(inst->dev->dev, "dec_finish_seq timed out\n");
@@ -243,10 +246,10 @@ int wave5_vpu_dec_close(struct vpu_instance *inst, u32 *fail_res)
 
                mutex_unlock(&vpu_dev->hw_lock);
                wave5_vpu_dec_get_output_info(inst, &dec_info);
-               ret = mutex_lock_interruptible(&vpu_dev->hw_lock);
-               if (ret) {
+               ret_mutex = mutex_lock_interruptible(&vpu_dev->hw_lock);
+               if (ret_mutex) {
                        pm_runtime_put_sync(inst->dev->dev);
-                       return ret;
+                       return ret_mutex;
                }
        } while (ret != 0);
 
@@ -482,6 +485,7 @@ dma_addr_t wave5_vpu_dec_get_rd_ptr(struct vpu_instance *inst)
 
        ret = mutex_lock_interruptible(&inst->dev->hw_lock);
        if (ret)
+               // FIXME this return type is wrong
                return ret;
 
        rd_ptr = wave5_dec_get_rd_ptr(inst);
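
About the FIXME above: the following user-space sketch shows why returning the lock error through dma_addr_t is a problem. It is only an illustration under stated assumptions. `sketch_dma_addr_t` is a hypothetical stand-in that assumes a 64-bit unsigned dma_addr_t (the real width depends on the kernel configuration), the address 0x1000 is made up, and the out-parameter variant is just one possible shape, not a proposal for the actual fix.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for dma_addr_t, assumed 64-bit unsigned here. */
typedef uint64_t sketch_dma_addr_t;

/* Made-up read pointer value used by both variants. */
#define SKETCH_RD_PTR ((sketch_dma_addr_t)0x1000)

/* Current shape: a negative errno is returned through an unsigned type. */
static sketch_dma_addr_t get_rd_ptr_unsigned(int lock_ret)
{
	if (lock_ret)
		return (sketch_dma_addr_t)lock_ret;	/* -EINTR turns into a huge "address" */

	return SKETCH_RD_PTR;
}

/* One possible alternative: status as return value, address via out parameter. */
static int get_rd_ptr_checked(int lock_ret, sketch_dma_addr_t *rd_ptr)
{
	if (lock_ret)
		return lock_ret;

	*rd_ptr = SKETCH_RD_PTR;
	return 0;
}

/* The unsigned variant cannot signal failure; the checked one can. */
static void check_rd_ptr_variants(void)
{
	sketch_dma_addr_t addr = 0;

	/* The error is indistinguishable from a (very large) valid address. */
	assert(get_rd_ptr_unsigned(-EINTR) == UINT64_MAX - (EINTR - 1));

	assert(get_rd_ptr_checked(-EINTR, &addr) == -EINTR && addr == 0);
	assert(get_rd_ptr_checked(0, &addr) == 0 && addr == SKETCH_RD_PTR);
}
```

A caller of the unsigned variant has no `< 0` test available, so an interrupted lock would be treated as a read pointer.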
