Message-ID: <10bfc6f5aaa02ad5858186ccee1894424fc0dd39.camel@pengutronix.de>
Date: Mon, 15 Feb 2021 11:15:19 +0100
From: Lucas Stach <l.stach@...gutronix.de>
To: Sven Van Asbroeck <thesven73@...il.com>,
Philipp Zabel <pza@...gutronix.de>
Cc: Nicolas Dufresne <nicolas@...fresne.ca>,
Mauro Carvalho Chehab <mchehab@...nel.org>,
Adrian Ratiu <adrian.ratiu@...labora.com>,
Fabio Estevam <festevam@...il.com>,
linux-media <linux-media@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [BUG REPORT] media: coda: mpeg4 decode corruption on i.MX6qp only
Hi Sven,
On Friday, 12.02.2021 at 18:52 -0500, Sven Van Asbroeck wrote:
> Philipp, Fabio,
>
> I was able to verify that the PREs do indeed overrun their allocated ocram area.
>
> Section 38.5.1 of the iMX6QuadPlus manual indicates the ocram size
> required: width(pixels) x 8 lines x 4 bytes. For 2048 pixels max, this
> comes to 64K. This is what the PRE driver allocates. So far, so good.
>
> The trouble starts when we're displaying a section of a much wider
> bitmap. This happens in X when using two displays. e.g.:
> HDMI 1920x1088
> LVDS 1280x800
> X bitmap 3200x1088, left side displayed on HDMI, right side on LVDS.
>
> In such a case, the stride will be much larger than the width of a
> display scanline.
Urgh, a badly tested corner case.
> This is where things start to go very wrong.
>
> I found that the ocram area used by the PREs increases with the
> stride. I experimentally found a formula:
> ocram_used = display_width x 8 x 4 + (bitmap_width - display_width) x 7 x 4
>
> As the stride increases, the PRE eventually overruns the ocram and...
> ends up in the "ocram aliased" area, where it overwrites the ocram in
> use by the vpu/coda !
>
> I could not find any PRE register setting that changes the used ocram area.
There is no such setting. The PRE always prefetches a doublebuffer of
2x4 scanlines and the scanline size is defined by the store engine
pitch.
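To put numbers on this, here is a minimal sketch of Sven's experimentally found formula next to an algebraically identical form written in terms of the pitch. The function and macro names are made up for illustration and are not from the PRE driver, and 4 bytes per pixel is assumed. Reading the second form as "7 full pitches plus one visible line" is only an interpretation of the arithmetic, not something confirmed by the manual.

```c
#include <stdint.h>

/* 2048 px * 8 lines * 4 bytes, the size quoted from the manual. */
#define PRE_OCRAM_BYTES   (64u * 1024u)
#define PRE_LINES         8u   /* double buffer of 2 x 4 scanlines */
#define BYTES_PER_PIXEL   4u   /* assumption: 32 bpp scanout */

/* Sven's empirical formula, exactly as reported (widths in pixels). */
static uint32_t pre_ocram_used_empirical(uint32_t display_width,
                                         uint32_t bitmap_width)
{
    return display_width * PRE_LINES * BYTES_PER_PIXEL +
           (bitmap_width - display_width) * (PRE_LINES - 1u) * BYTES_PER_PIXEL;
}

/*
 * The same value rearranged: 7 lines at the full pitch plus one line of
 * display_width. This is plain algebra on the formula above.
 */
static uint32_t pre_ocram_used_pitch(uint32_t display_width,
                                     uint32_t bitmap_width)
{
    uint32_t pitch = bitmap_width * BYTES_PER_PIXEL;

    return (PRE_LINES - 1u) * pitch + display_width * BYTES_PER_PIXEL;
}
```

For the reported setup (1920 pixels shown out of a 3200 pixel wide bitmap) both functions give 97280 bytes, well past the 65536 bytes allocated.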
The straightforward way to fix this would be to just disable the PRE
when the stride gets too large. That might not work well with all
userspace requirements, though, as it effectively removes the ability to
scan out GPU-tiled surfaces at those strides.
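A rough sketch of what such a check could look like, assuming the 64K allocation mentioned earlier in the thread and a conservative bound of 8 full scanlines at the store pitch. The helper name is hypothetical and this is not the actual driver code.

```c
#include <stdbool.h>
#include <stdint.h>

#define PRE_OCRAM_BYTES     (64u * 1024u) /* per-PRE allocation from the thread */
#define PRE_PREFETCH_LINES  8u            /* double buffer of 2 x 4 scanlines */

/*
 * Hypothetical helper: returns true when 8 scanlines at the given store
 * pitch still fit into the OCRAM area, i.e. the PRE may be used.
 * The bound is conservative: it charges the full pitch for every line.
 */
static bool pre_pitch_fits_ocram(uint32_t store_pitch_bytes)
{
    return (uint64_t)store_pitch_bytes * PRE_PREFETCH_LINES <= PRE_OCRAM_BYTES;
}
```

With 4 bytes per pixel, a 2048 pixel pitch (8192 bytes) passes, while the 3200 pixel wide X bitmap (12800 bytes) does not and would have to fall back to scanout without the PRE.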
I'm not sure if this works in practice, as the PRG address rewriting
might make this harder than it seems, but one could probably try to
rewrite the prefetch start address, input pitch, input width/height and
store pitch of the PRE settings to cover only the area used by the
CRTC to reduce OCRAM requirements.
Regards,
Lucas