Message-ID: <be4c31cf8f37d599012ad9d7aa68468a0d307a2c.camel@collabora.com>
Date: Tue, 12 Aug 2025 14:52:54 -0400
From: Nicolas Dufresne <nicolas.dufresne@...labora.com>
To: Jonas Karlman <jonas@...boo.se>
Cc: Ezequiel Garcia <ezequiel@...guardiasur.com.ar>, Detlev Casanova	
 <detlev.casanova@...labora.com>, Mauro Carvalho Chehab
 <mchehab@...nel.org>,  Alex Bee <knaerzche@...il.com>,
 linux-media@...r.kernel.org, linux-rockchip@...ts.infradead.org, 
	devicetree@...r.kernel.org, linux-arm-kernel@...ts.infradead.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 0/7] media: rkvdec: Add HEVC backend

On Tuesday, 12 August 2025 at 14:26 -0400, Nicolas Dufresne wrote:
> Hi Jonas,
> 
> On Tuesday, 12 August 2025 at 19:31 +0200, Jonas Karlman wrote:
> > On 8/12/2025 2:44 PM, Nicolas Dufresne wrote:
> > > I forgot, 
> > > 
> > > On Tuesday, 12 August 2025 at 08:38 -0400, Nicolas Dufresne wrote:
> > > > > JCT-VC-HEVC_V1 on GStreamer-H.265-V4L2SL-Gst1.0:
> > > > > 
> > > > > - DBLK_D_VIXS_2 (fail)
> > > > > - DSLICE_A_HHI_5 (fail)
> > > > > - EXT_A_ericsson_4 (fail)
> > > > > - PICSIZE_A_Bossen_1 (error)
> > > > > - PICSIZE_B_Bossen_1 (error)
> > > > > - PICSIZE_C_Bossen_1 (error)
> > > > > - PICSIZE_D_Bossen_1 (error)
> > > > > - SAODBLK_A_MainConcept_4 (fail)
> > > > > - SAODBLK_B_MainConcept_4 (fail)
> > > > > - TSUNEQBD_A_MAIN10_Technicolor_2 (error)
> > > 
> > > I'm getting the same result if I force a single job in fluster. The test I
> > > posted was with 2 jobs. Detlev found that the iommu reset is required in
> > > more cases on RK3588/3576; perhaps the HEVC decoder in older hardware needs
> > > the same. I will try and report.
> > 
> > The vendor kernel [1] checks the following bits from the
> > RKVDEC_REG_INTERRUPT register to decide if a full HW reset should be done.
> > 
> >   err_mask = RKVDEC_BUF_EMPTY_STA
> >   	   | RKVDEC_BUS_STA
> >   	   | RKVDEC_COLMV_REF_ERR_STA
> >   	   | RKVDEC_ERR_STA
> >   	   | RKVDEC_TIMEOUT_STA;
> > 
> > Adding proper reset support can be rather involved, which is the main
> > reason why this series does not handle it; it is better suited for a
> > separate future series.
> > 
> > Proper HW reset will require e.g. dt-bindings, DT updates and pmu idle
> > request integration, and for rk3328 the vendor even moved VPU reset to TF-A.
> > 
> > Doing the iommu detach/attach dance not only on RKVDEC_SOFTRESET_RDY
> > could possibly improve some cases, until full reset can be implemented.
> 
> Rockchip is following the VSI design of "self reset" on error. But since the
> iommu is part of the device, it also gets reset, which implies having to
> reprogram it. This has proven to be very reliable logic, despite RK doing a
> hard reset.
> 
> Since self reset is documented for RKVDEC_BUS_STA, RKVDEC_ERR_STA and
> RKVDEC_TIMEOUT_STA, it would seem that RKVDEC_BUF_EMPTY_STA is redundant,
> unless it's an asynchronous operation that needs to be polled. Possibly
> something to investigate. RKVDEC_BUF_EMPTY_STA and RKVDEC_COLMV_REF_ERR_STA
> are not documented as such, so it's not quite logical to reprogram the iommu.
> 
> I don't immediately trust reference software for this type of thing; we
> should find what works best and have a rationale for it. The hard reset is
> very expensive, and hard to upstream.
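
To make the distinction in the quoted text concrete, here is a minimal sketch
of splitting the vendor err_mask into the bits documented as triggering a self
reset (where reprogramming the iommu makes sense) and the two that are not.
The bit positions, the DEMO_ prefix, the enum and the helper name are all
hypothetical placeholders for illustration; the real values live in the driver
register header and are not reproduced here.

```c
#include <stdint.h>

/* Placeholder bit positions, NOT the real RKVDEC register layout. */
#define DEMO_RKVDEC_BUS_STA           (1u << 0)
#define DEMO_RKVDEC_ERR_STA           (1u << 1)
#define DEMO_RKVDEC_TIMEOUT_STA       (1u << 2)
#define DEMO_RKVDEC_BUF_EMPTY_STA     (1u << 3)
#define DEMO_RKVDEC_COLMV_REF_ERR_STA (1u << 4)

/* Bits the thread says are documented as causing a self reset. */
#define DEMO_SELF_RESET_MASK \
	(DEMO_RKVDEC_BUS_STA | DEMO_RKVDEC_ERR_STA | DEMO_RKVDEC_TIMEOUT_STA)

/* Bits in the vendor err_mask that are not documented as such. */
#define DEMO_UNDOCUMENTED_MASK \
	(DEMO_RKVDEC_BUF_EMPTY_STA | DEMO_RKVDEC_COLMV_REF_ERR_STA)

enum demo_recovery {
	DEMO_RECOVERY_NONE,            /* no error bit set */
	DEMO_RECOVERY_IOMMU_REPROGRAM, /* self reset happened, iommu state lost */
	DEMO_RECOVERY_INVESTIGATE,     /* error bit with no documented self reset */
};

/* Hypothetical helper: map an interrupt status word to a recovery action. */
enum demo_recovery demo_classify_status(uint32_t status)
{
	if (status & DEMO_SELF_RESET_MASK)
		return DEMO_RECOVERY_IOMMU_REPROGRAM;
	if (status & DEMO_UNDOCUMENTED_MASK)
		return DEMO_RECOVERY_INVESTIGATE;
	return DEMO_RECOVERY_NONE;
}
```

Keeping the two masks separate records the rationale for each bit in the code
itself, instead of copying the vendor mask wholesale.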

I did the test, and it's not that. There is in fact no error, just a corrupted
image. The more parallelism, the more failures. Another important point: there
are no mmu faults, so it's not that either. You also reported flakiness, and
that rerunning made it work.

The problem is likely due to some register being forgotten and left at its
previous value. If you let it sit, it will PM suspend, and a proper reset
happens. The stream then decodes fine. If you run it concurrently with another
stream, it decodes from dirty state and fails. I think that theory fits a lot
better, and it is a very common issue. Adding a hard reset would not fix this
one.

Porting to in-RAM registers is the easiest way to fix that. It really reminds
me of:

  7fcb42b3835e9 media: verisilicon: HEVC: Initialize start_bit field

Which took quite some time to find.
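
As a rough illustration of why an in-RAM register image avoids this class of
bug, here is a minimal sketch. It is not the rkvdec driver code: the register
layout, struct names and helpers are hypothetical, and a plain array stands in
for the MMIO window. The first helper writes only the registers the current
job cares about, so a field set by a previous job survives; the second builds
a zeroed image in memory and writes every register.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-register layout, for illustration only. */
enum {
	DEMO_REG_WIDTH,
	DEMO_REG_HEIGHT,
	DEMO_REG_START_BIT,
	DEMO_REG_CTRL,
	DEMO_NUM_REGS,
};

struct demo_job {
	uint32_t width, height, start_bit;
	int uses_start_bit; /* some streams never set this field */
};

struct demo_regs {
	uint32_t reg[DEMO_NUM_REGS];
};

/* Incremental programming: untouched registers keep the previous job's value. */
void demo_program_incremental(uint32_t *hw, const struct demo_job *job)
{
	hw[DEMO_REG_WIDTH] = job->width;
	hw[DEMO_REG_HEIGHT] = job->height;
	if (job->uses_start_bit)
		hw[DEMO_REG_START_BIT] = job->start_bit;
	hw[DEMO_REG_CTRL] = 1;
}

/* In-RAM image: start from zero, fill in the job, then write everything. */
void demo_program_from_image(uint32_t *hw, const struct demo_job *job)
{
	struct demo_regs img;
	size_t i;

	memset(&img, 0, sizeof(img)); /* every field has a defined value */
	img.reg[DEMO_REG_WIDTH] = job->width;
	img.reg[DEMO_REG_HEIGHT] = job->height;
	if (job->uses_start_bit)
		img.reg[DEMO_REG_START_BIT] = job->start_bit;
	img.reg[DEMO_REG_CTRL] = 1;

	for (i = 0; i < DEMO_NUM_REGS; i++)
		hw[i] = img.reg[i];
}
```

With two back-to-back jobs where only the first sets start_bit, the
incremental path leaves the first job's start_bit in place for the second,
while the image path writes zero. That matches the symptom of streams decoding
fine after a PM suspend but failing when interleaved with another stream.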

Nicolas

> 
> Nicolas
> 
> > 
> > [1]
> > https://github.com/Kwiboo/linux-rockchip/blob/linux-6.1-stan-rkr6.1/drivers/video/rockchip/mpp/mpp_rkvdec.c#L924-L931
> > 
> > Regards,
> > Jonas
> > 
> > > 
> > > Nicolas
