Message-ID: <20240229144137.75ae1cde.pekka.paalanen@collabora.com>
Date: Thu, 29 Feb 2024 14:41:37 +0200
From: Pekka Paalanen <pekka.paalanen@...labora.com>
To: Arthur Grillo <arthurgrillo@...eup.net>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@...il.com>, Melissa Wen
<melissa.srw@...il.com>, Maíra Canal
<mairacanal@...eup.net>, Haneen Mohammed <hamohammed.sa@...il.com>, Daniel
Vetter <daniel@...ll.ch>, Maarten Lankhorst
<maarten.lankhorst@...ux.intel.com>, Maxime Ripard <mripard@...nel.org>,
Thomas Zimmermann <tzimmermann@...e.de>, David Airlie <airlied@...il.com>,
Jonathan Corbet <corbet@....net>, dri-devel@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, jeremie.dautheribes@...tlin.com,
miquel.raynal@...tlin.com, thomas.petazzoni@...tlin.com,
seanpaul@...gle.com, marcheu@...gle.com, nicolejadeyee@...gle.com
Subject: Re: [PATCH v3 3/9] drm/vkms: write/update the documentation for
pixel conversion and pixel write functions
On Tue, 27 Feb 2024 15:47:08 -0300
Arthur Grillo <arthurgrillo@...eup.net> wrote:
> On 27/02/24 12:02, Louis Chauvet wrote:
> > On 26/02/24 - 10:07, Arthur Grillo wrote:
> >>
> >>
> >> On 26/02/24 05:46, Louis Chauvet wrote:
> >>> Add some documentation on pixel conversion functions.
> >>> Update of outdated comments for pixel_write functions.
> >>>
> >>> Signed-off-by: Louis Chauvet <louis.chauvet@...tlin.com>
> >>> ---
> >>> drivers/gpu/drm/vkms/vkms_composer.c | 4 +++
> >>> drivers/gpu/drm/vkms/vkms_drv.h | 13 ++++++++
> >>> drivers/gpu/drm/vkms/vkms_formats.c | 58 ++++++++++++++++++++++++++++++------
> >>> 3 files changed, 66 insertions(+), 9 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> >>> index c6d9b4a65809..5b341222d239 100644
> >>> --- a/drivers/gpu/drm/vkms/vkms_composer.c
> >>> +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> >>> @@ -189,6 +189,10 @@ static void blend(struct vkms_writeback_job *wb,
> >>>
> >>> size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> >>>
> >>> + /*
> >>> + * The planes are composed line-by-line. It is a necessary complexity to avoid poor
> >>> + * blending performance.
> >>
> >> At this point in the series, you have not yet reintroduced the
> >> line-by-line algorithm. Maybe it's better to add this comment when
> >> you do.
> >
> > Is it better with this:
> >
> > /*
> > * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
> > * complexity to avoid poor blending performance.
> > *
> > * The function vkms_compose_row is used to read a line, pixel-by-pixel, into the staging
> > * buffer.
> > */
> >
> >> Also, I think it's good to give more context, like:
> >> "The planes are composed line-by-line, instead of pixel-by-pixel"
> >
> > And after PATCHv3 5/9:
> >
> > /*
> > * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
> > * complexity to avoid poor blending performance.
> > *
> > * The function pixel_read_line callback is used to read a line, using an efficient
> > * algorithm for a specific format, into the staging buffer.
> > */
> >
Hi,
there are a few reasons for the line-by-line algorithm, and the
optimizations at large:
VKMS uses temporary stage and output buffers so that blending functions
can operate on just one high-precision pixel format, struct
pixel_argb_u16. We can make pixel-format-specific read and write
functions completely orthogonal to the blending operations and FB
format combinations. This avoids a combinatorial explosion of needed
functions for { input pixel formats × blending operations × output pixel
formats }.
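As a rough illustration of that orthogonality (a standalone sketch, not
the VKMS code): struct pixel_argb_u16 is redefined locally so the
example compiles on its own, and the callback typedefs, the function
names and process_line() are hypothetical names chosen for this sketch.
Each format contributes one reader or one writer, each blending
operation one function, and any combination can be chained through the
common intermediate format.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the common high-precision intermediate format. */
struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

/* Hypothetical callback types: one reader or writer per pixel format. */
typedef void (*read_line_fn)(const uint8_t *src,
			     struct pixel_argb_u16 *out, size_t n);
typedef void (*write_line_fn)(const struct pixel_argb_u16 *in,
			      uint8_t *dst, size_t n);
typedef void (*blend_line_fn)(const struct pixel_argb_u16 *stage,
			      struct pixel_argb_u16 *output, size_t n);

/* Reader: little-endian XRGB8888 (bytes in memory: B, G, R, X). */
void read_line_xrgb8888(const uint8_t *src,
			struct pixel_argb_u16 *out, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		out[i].a = 0xffff;                      /* X is ignored */
		out[i].r = (uint16_t)(src[4 * i + 2] * 257);
		out[i].g = (uint16_t)(src[4 * i + 1] * 257);
		out[i].b = (uint16_t)(src[4 * i + 0] * 257);
	}
}

/* Writer: little-endian RGB565, truncating (not rounding) each channel. */
void write_line_rgb565(const struct pixel_argb_u16 *in,
		       uint8_t *dst, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		uint16_t v = (uint16_t)(((in[i].r >> 11) << 11) |
					((in[i].g >> 10) << 5) |
					(in[i].b >> 11));
		dst[2 * i + 0] = (uint8_t)(v & 0xff);
		dst[2 * i + 1] = (uint8_t)(v >> 8);
	}
}

/* Simplest possible blending operation: the source replaces the output. */
void blend_line_replace(const struct pixel_argb_u16 *stage,
			struct pixel_argb_u16 *output, size_t n)
{
	for (size_t i = 0; i < n; i++)
		output[i] = stage[i];
}

/*
 * Any reader can be paired with any blend and any writer, so M input
 * formats, K blending operations and N output formats need M + K + N
 * functions rather than M * K * N.
 */
void process_line(read_line_fn read, blend_line_fn blend,
		  write_line_fn write,
		  const uint8_t *src, uint8_t *dst,
		  struct pixel_argb_u16 *stage,
		  struct pixel_argb_u16 *output, size_t n)
{
	read(src, stage, n);
	blend(stage, output, n);
	write(output, dst, n);
}
```

Adding a new input format in this scheme means writing one more reader;
none of the blend or write functions need to change.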
We can use a temporary stage and output buffer whose size is one line
rather than a whole FB or CRTC framebuffer. This is where the memory
savings come from.
Using a temporary output buffer also avoids repeated
read-decode-blend-encode-write cycles into the final destination
buffer: intermediate results stay in struct pixel_argb_u16, so the
destination pixel format is encoded only once per line.
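A sketch of how the one-line buffers fit together, under simplifying
assumptions that are mine and not taken from VKMS: every plane is
full-screen, premultiplied, little-endian ARGB8888, the destination is
XRGB8888, and the buffers live on the stack with an arbitrary MAX_WIDTH.
struct plane, compose_frame() and the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_WIDTH 64	/* arbitrary limit for this sketch */

struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

/* Hypothetical full-screen plane: premultiplied ARGB8888 (B, G, R, A). */
struct plane {
	const uint8_t *pixels;
	size_t pitch;		/* bytes per line */
};

static void read_line_argb8888(const uint8_t *src,
			       struct pixel_argb_u16 *out, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		out[i].b = (uint16_t)(src[4 * i + 0] * 257);
		out[i].g = (uint16_t)(src[4 * i + 1] * 257);
		out[i].r = (uint16_t)(src[4 * i + 2] * 257);
		out[i].a = (uint16_t)(src[4 * i + 3] * 257);
	}
}

/* Premultiplied "source over": out = src + out * (1 - src.a), clamped. */
static uint16_t blend_channel(uint16_t src, uint16_t dst, uint16_t alpha)
{
	uint32_t v = src + (uint32_t)dst * (0xffffu - alpha) / 0xffffu;

	return v > 0xffff ? 0xffff : (uint16_t)v;
}

/* Colour channels only; the output alpha stays opaque in this sketch. */
static void blend_line(const struct pixel_argb_u16 *stage,
		       struct pixel_argb_u16 *output, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		output[i].r = blend_channel(stage[i].r, output[i].r, stage[i].a);
		output[i].g = blend_channel(stage[i].g, output[i].g, stage[i].a);
		output[i].b = blend_channel(stage[i].b, output[i].b, stage[i].a);
	}
}

static void write_line_xrgb8888(const struct pixel_argb_u16 *in,
				uint8_t *dst, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		dst[4 * i + 0] = (uint8_t)(in[i].b >> 8);
		dst[4 * i + 1] = (uint8_t)(in[i].g >> 8);
		dst[4 * i + 2] = (uint8_t)(in[i].r >> 8);
		dst[4 * i + 3] = 0xff;
	}
}

/* Returns 0 on success, -1 if the line does not fit the buffers. */
int compose_frame(const struct plane *planes, size_t n_planes,
		  uint8_t *dst, size_t dst_pitch,
		  size_t width, size_t height)
{
	/* Only two lines of high-precision pixels, never a whole frame. */
	struct pixel_argb_u16 stage[MAX_WIDTH], output[MAX_WIDTH];

	if (width > MAX_WIDTH)
		return -1;

	for (size_t y = 0; y < height; y++) {
		/* Start each line from opaque black. */
		for (size_t x = 0; x < width; x++)
			output[x] = (struct pixel_argb_u16){ .a = 0xffff };

		/* All planes accumulate in the high-precision output line. */
		for (size_t p = 0; p < n_planes; p++) {
			read_line_argb8888(planes[p].pixels + y * planes[p].pitch,
					   stage, width);
			blend_line(stage, output, width);
		}

		/* The destination is encoded once, after the last plane. */
		write_line_xrgb8888(output, dst + y * dst_pitch, width);
	}
	return 0;
}
```

Without the output line, each plane after the first would have to read
back and decode the destination, blend, and encode it again.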
Finally, doing elementary operations (read, blend, write) line-by-line
is much more efficient than pixel-by-pixel, because it allows making
the innermost loop very tight. It avoids repeatedly computing a result
that does not change, like which function to call for a specific pixel
format or blending equation.
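To make the last point concrete, here is a small standalone sketch
contrasting the two loop shapes. enum pixel_format, the decode helpers
and get_read_line() are hypothetical names for this example, not VKMS
identifiers. A compiler may or may not hoist the per-pixel decision on
its own; the line-based shape makes the hoisting explicit.

```c
#include <stddef.h>
#include <stdint.h>

struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

enum pixel_format { FMT_XRGB8888, FMT_RGB565 };	/* hypothetical */

typedef void (*read_line_fn)(const uint8_t *src,
			     struct pixel_argb_u16 *out, size_t n);

static struct pixel_argb_u16 decode_xrgb8888(const uint8_t *p)
{
	return (struct pixel_argb_u16){
		.a = 0xffff,
		.r = (uint16_t)(p[2] * 257),
		.g = (uint16_t)(p[1] * 257),
		.b = (uint16_t)(p[0] * 257),
	};
}

static struct pixel_argb_u16 decode_rgb565(const uint8_t *p)
{
	unsigned int v = p[0] | ((unsigned int)p[1] << 8);

	return (struct pixel_argb_u16){
		.a = 0xffff,
		.r = (uint16_t)(((v >> 11) & 0x1f) * 0xffff / 31),
		.g = (uint16_t)(((v >> 5) & 0x3f) * 0xffff / 63),
		.b = (uint16_t)((v & 0x1f) * 0xffff / 31),
	};
}

/* Pixel-by-pixel shape: the format decision is repeated for every pixel. */
void read_line_per_pixel(enum pixel_format fmt, const uint8_t *src,
			 struct pixel_argb_u16 *out, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		switch (fmt) {
		case FMT_XRGB8888:
			out[i] = decode_xrgb8888(src + 4 * i);
			break;
		case FMT_RGB565:
			out[i] = decode_rgb565(src + 2 * i);
			break;
		}
	}
}

/* Line-by-line shape: one tight loop per format, no decisions inside. */
static void read_line_xrgb8888(const uint8_t *src,
			       struct pixel_argb_u16 *out, size_t n)
{
	for (size_t i = 0; i < n; i++)
		out[i] = decode_xrgb8888(src + 4 * i);
}

static void read_line_rgb565(const uint8_t *src,
			     struct pixel_argb_u16 *out, size_t n)
{
	for (size_t i = 0; i < n; i++)
		out[i] = decode_rgb565(src + 2 * i);
}

/* The format is resolved once; the result is reused for every line. */
read_line_fn get_read_line(enum pixel_format fmt)
{
	switch (fmt) {
	case FMT_XRGB8888:
		return read_line_xrgb8888;
	case FMT_RGB565:
		return read_line_rgb565;
	}
	return NULL;
}
```

Both shapes produce the same pixels; the difference is only where the
"which function?" question gets answered.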
Thanks,
pq