Message-ID: <f79ab2a0db0eb4aad20ed488de3635f9d8942cdf.camel@collabora.com>
Date: Fri, 29 Aug 2025 11:40:59 -0400
From: Nicolas Dufresne <nicolas.dufresne@...labora.com>
To: "Jackson.lee" <jackson.lee@...psnmedia.com>, mchehab@...nel.org,
hverkuil-cisco@...all.nl, bob.beckett@...labora.com
Cc: linux-media@...r.kernel.org, linux-kernel@...r.kernel.org,
lafley.kim@...psnmedia.com, b-brnich@...com, hverkuil@...all.nl,
nas.chung@...psnmedia.com
Subject: Re: [PATCH v3 0/4] Performance improvement of decoder
Hi Jackson,
On Monday, 23 June 2025 at 09:21 +0900, Jackson.lee wrote:
> From: Jackson Lee <jackson.lee@...psnmedia.com>
>
> v4l2-compliance results:
> ========================
>
> v4l2-compliance 1.28.1-5233, 64 bits, 64-bit time_t
>
> Buffer ioctls:
> warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS not supported
> warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS not supported
> test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: OK
> test CREATE_BUFS maximum buffers: OK
> test VIDIOC_EXPBUF: OK
> test Requests: OK (Not Supported)
>
> Total for wave5-dec device /dev/video0: 46, Succeeded: 46, Failed: 0, Warnings: 2
> Total for wave5-enc device /dev/video1: 46, Succeeded: 46, Failed: 0, Warnings: 0
>
> Fluster test results:
> =====================
>
> Running test suite JCT-VC-HEVC_V1 with decoder GStreamer-H.265-V4L2-Gst1.0
> Using 3 parallel job(s)
> Ran 133/147 tests successfully in 40.114 secs
>
> (1 test fails because of not supporting to parse multi frames, 1 test fails because of a missing frame and slight corruption,
> 2 tests fail because of sizes which are incompatible with the IP, 11 tests fail because of unsupported 10 bit format)
>
>
> Running test suite JVT-AVC_V1 with decoder GStreamer-H.264-V4L2-Gst1.0
> Using 3 parallel job(s)
> Ran 78/135 tests successfully in 43.364 secs
>
> (57 fail because the hardware is unable to decode MBAFF / FMO / Field / Extended profile streams.)
>
> Running test suite JVT-FR-EXT with decoder GStreamer-H.264-V4L2-Gst1.0
> Using 3 parallel job(s)
> Ran 25/69 tests successfully in 40.411 secs
Ack, same results here and consistent.
>
> (44 fail because the hardware does not support field encoded and 422 encoded stream)
>
> Seek test
> =====================
> 1. gst-play-1.0 seek.264
> 2. this will use waylandsink since gst-play-1.0 uses playbin.
> if you don't want to hook up display,
> you can run gst-play-1.0 seek.264 --videosink=fakevideosink instead
> 3. Let pipeline run for 2-3 seconds
> 4. press SPACE key to pause
> 5. press 0 to reset press SPACE to start again
>
> gst-play-1.0 seek.264 --videosink=fakevideosink
> Press 'k' to see a list of keyboard shortcuts.
> Now playing /root/seek.264
> Redistribute latency...
> Redistribute latency...
> Redistribute latency...
> Redistribute latency...
> Redistribute latency...aused
> 0:00:09.9 / 0:00:09.7
> Reached end of play list.
>
> Sequence Change test
> =====================
> gst-launch-1.0 filesrc location=./drc.h264 ! h264parse ! v4l2h264dec ! filesink location=./h264_output_420.yuv
> Setting pipeline to PAUSED ...
> Pipeline is PREROLLING ...
> Redistribute latency...
> Pipeline is PREROLLED ...
> Setting pipeline to PLAYING ...
> New clock: GstSystemClock
> Redistribute latency...
> Got EOS from element "pipeline0".
> Execution ended after 0:00:00.113620590
> Setting pipeline to NULL ...
> Freeing pipeline ...
I tried to reproduce your results. I used an ISO MP4 file, nothing big, a 720p
10 min video. After 30 s of seeking back and forth I got a deadlock, with the
following kernel log:
vdec 4210000.video-codec: wave5_vpu_firmware_command_queue_error_check: still running: 0x1000
I don't know if it's worse than before, but the bug is severe enough to be a
concern. To reproduce it easily, I pick a longer video, seek forward close to
the end, and then seek back very quickly (with gst-play, so in smaller steps)
until it reaches position 0, and repeat.
This happened without any resolution change concurrent with the seeks, just a
flat, single-resolution video. Once I do the same test with an aggressive DRC in
place, I hit a kernel crash. I will share the DRC H.264 sample I'm using by
private email, along with how to make it bigger so it's manually seekable.
[ 678.819859] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000358
[ 678.828746] Mem abort info:
[ 678.832378] ESR = 0x0000000096000004
[ 678.838555] EC = 0x25: DABT (current EL), IL = 32 bits
[ 678.845921] SET = 0, FnV = 0
[ 678.849882] EA = 0, S1PTW = 0
[ 678.854241] FSC = 0x04: level 0 translation fault
[ 678.860098] Data abort info:
[ 678.864410] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 678.871000] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 678.877384] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 678.887785] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000882215000
[ 678.901210] [0000000000000358] pgd=0000000000000000, p4d=0000000000000000
[ 678.908585] Internal error: Oops: 0000000096000004 [#1] SMP
[ 678.914266] Modules linked in: rfkill qrtr rpmsg_ctrl rpmsg_char phy_cadence_torrent tps6594_esm tps6594_pfsm tps6594_regulator rtc_tps6594 ti_am335x_adc kfifo_buf pinctrl_tps6594 gpio_regmap cdns3 cdns_usb_common mux_gpio omap_mailbox ti_k3_r5_remoteproc phy_j721e_wiz phy_can_transceiver wave5 v4l2_mem2mem powervr videobuf2_dma_contig drm_gpuvm videobuf2_memops videobuf2_v4l2 drm_exec at24 drm_shmem_helper tps6594_i2c videodev gpu_sched tps6594_core videobuf2_common k3_j72xx_bandgap ti_k3_dsp_remoteproc mc drm_kms_helper ti_k3_common sa2ul ti_am335x_tscadc authenc m_can_platform m_can can_dev cdns3_ti rti_wdt fuse drm dm_mod backlight ipv6
[ 678.971012] CPU: 1 UID: 0 PID: 51 Comm: kworker/1:1 Not tainted 6.17.0-rc3-jacinto+ #2 PREEMPT
[ 678.979704] Hardware name: Texas Instruments J721S2 EVM (DT)
[ 678.985358] Workqueue: events v4l2_m2m_device_run_work [v4l2_mem2mem]
[ 678.991811] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 678.998767] pc : v4l2_m2m_try_run+0x74/0x13c [v4l2_mem2mem]
[ 679.004345] lr : v4l2_m2m_try_run+0x60/0x13c [v4l2_mem2mem]
[ 679.009922] sp : ffff800083333d60
[ 679.013232] x29: ffff800083333d60 x28: 0000000000000000 x27: 0000000000000000
[ 679.020358] x26: ffff000b7dfa8468 x25: 0000000000000000 x24: ffff000800012205
[ 679.027480] x23: ffff0008011aa300 x22: ffff000b7dfa8440 x21: ffff0008053f2220
[ 679.034602] x20: ffff000800012200 x19: ffff0008053f2000 x18: 0000000000000000
[ 679.041724] x17: 0000000000000000 x16: 0000000000000000 x15: 009f729c552fd3f8
[ 679.048846] x14: 00000000000002ae x13: ffff8000811f4790 x12: 0000000000000537
[ 679.055968] x11: 00000000000000c0 x10: 0000000000000ab0 x9 : ffff800083333c80
[ 679.063090] x8 : ffff0008011aae10 x7 : 0000000000002d02 x6 : 000000000000ba6b
[ 679.070212] x5 : ffff000827f68b40 x4 : ffff0008011aa300 x3 : ffff00080b2bb480
[ 679.077333] x2 : 0000000000000000 x1 : ffff80007a972538 x0 : 0000000000000000
[ 679.084456] Call trace:
[ 679.086893] v4l2_m2m_try_run+0x74/0x13c [v4l2_mem2mem] (P)
[ 679.092462] v4l2_m2m_device_run_work+0x14/0x20 [v4l2_mem2mem]
[ 679.098285] process_one_work+0x150/0x290
[ 679.102294] worker_thread+0x2d0/0x3ec
[ 679.106034] kthread+0x12c/0x210
[ 679.109255] ret_from_fork+0x10/0x20
[ 679.112825] Code: 39530000 370005c0 f9400260 f9412661 (f941ac00)
[ 679.118905] ---[ end trace 0000000000000000 ]---
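The ESR value and the faulting opcode in the trace above can be cross-checked
by hand. The following is a minimal sketch, assuming the standard AArch64
ESR_ELx field layout and the 64-bit LDR (immediate, unsigned offset) encoding.
The helper names decode_esr and decode_ldr_x_uimm are illustrative only and do
not come from the kernel or the driver.

```python
# Sketch: decode the ESR value and the faulting opcode from the oops above.
# Field layouts follow the AArch64 architecture; helper names are hypothetical.

def decode_esr(esr: int) -> dict:
    """Split an ESR_ELx value into the fields the kernel prints."""
    iss = esr & 0x1FFFFFF              # ISS, bits [24:0]
    return {
        "ec": (esr >> 26) & 0x3F,      # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,       # 1 means a 32-bit instruction
        "iss": iss,
        "fsc": iss & 0x3F,             # data fault status code, ISS bits [5:0]
        "wnr": (iss >> 6) & 0x1,       # 0 means the faulting access was a read
    }


def decode_ldr_x_uimm(insn: int) -> dict:
    """Decode a 64-bit LDR (immediate, unsigned offset) opcode."""
    if (insn >> 22) != 0x3E5:          # 0b1111100101 selects this encoding
        raise ValueError("not a 64-bit LDR with unsigned offset")
    return {
        "rt": insn & 0x1F,                     # destination register number
        "rn": (insn >> 5) & 0x1F,              # base register number
        "offset": ((insn >> 10) & 0xFFF) * 8,  # imm12 scaled by 8 bytes
    }


if __name__ == "__main__":
    esr = decode_esr(0x96000004)       # value from the "ESR =" line
    insn = decode_ldr_x_uimm(0xF941AC00)  # opcode in parentheses on "Code:"
    print({k: hex(v) for k, v in esr.items()})
    print({k: hex(v) for k, v in insn.items()})
```

With these inputs the ESR decodes to EC 0x25, IL 1 and FSC 0x04, matching the
"Mem abort info" lines, and the opcode decodes to a load through x0 at offset
0x358. Since x0 is 0 in the register dump, the access lands on address 0x358,
which is the reported fault address. In other words, v4l2_m2m_try_run read a
field at offset 0x358 through a NULL pointer.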
>
> Change since v2:
> ==================
> * For [PATCH v3 4/4] media: chips-media: wave5: Improve performance of decoder
> - squash v2's #3~#6 to #4 patch of v3
Thanks for this update, I'll check if anything is left apart from stability and
provide feedback. I'm looking forward to your input on the bug I reported above.
Nicolas
>
> Change since v1:
> ===================
> * For [PATCH v2 2/7] media: chips-media: wave5: Improve performance of decoder
> - change log to dbg level
>
> Change since v0:
> ===================
> * For [PATCH v1 2/7] media: chips-media: wave5: Improve performance of decoder
> - separates the previous patch to a few patches
>
> * For [PATCH v1 3/7] media: chips-media: wave5: Fix not to be closed
> - separated from the previous patch of performance improvement of
> decoder
>
> * For [PATCH v1 4/7] media: chips-media: wave5: Use spinlock whenever state is changed
> - separated from the previous patch of performance improvement of
> decoder
>
> * For [PATCH v1 5/7] media: chips-media: wave5: Fix not to free resources normally when
> instance was destroyed
> - separated from the previous patch of performance improvement of
> decoder
>
> * For [PATCH v1 7/7] media: chips-media: wave5: Fix SError of kernel panic when closed
> - separated from the previous patch of performance improvement of
> decoder
>
>
> Jackson Lee (4):
> media: chips-media: wave5: Fix SError of kernel panic when closed
> media: chips-media: wave5: Fix Null reference while testing fluster
> media: chips-media: wave5: Add WARN_ON to check if dec_output_info is
> NULL
> media: chips-media: wave5: Improve performance of decoder
>
> .../platform/chips-media/wave5/wave5-helper.c | 23 ++-
> .../platform/chips-media/wave5/wave5-hw.c | 2 +-
> .../chips-media/wave5/wave5-vpu-dec.c | 139 ++++++++++++------
> .../chips-media/wave5/wave5-vpu-enc.c | 8 +-
> .../platform/chips-media/wave5/wave5-vpu.c | 71 +++++++--
> .../platform/chips-media/wave5/wave5-vpuapi.c | 37 ++---
> .../platform/chips-media/wave5/wave5-vpuapi.h | 11 ++
> .../chips-media/wave5/wave5-vpuconfig.h | 1 +
> 8 files changed, 219 insertions(+), 73 deletions(-)