linux-kernel - Re: [PATCH v2 2/7] drm/vkms: Add support for multy-planar framebuffers

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZbvXHECSBmH0NDZn@localhost.localdomain>
Date: Thu, 1 Feb 2024 18:38:36 +0100
From: Louis Chauvet <louis.chauvet@...tlin.com>
To: Arthur Grillo <arthurgrillo@...eup.net>
Cc: Daniel Vetter <daniel@...ll.ch>, David Airlie <airlied@...il.com>,
	Haneen Mohammed <hamohammed.sa@...il.com>,
	Harry Wentland <harry.wentland@....com>,
	Jonathan Corbet <corbet@....net>,
	Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
	Maxime Ripard <mripard@...nel.org>,
	Maíra Canal <mairacanal@...eup.net>,
	Melissa Wen <melissa.srw@...il.com>,
	Rodrigo Siqueira <rodrigosiqueiramelo@...il.com>,
	Thomas Zimmermann <tzimmermann@...e.de>,
	linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
	linux-doc@...r.kernel.org, miquel.raynal@...tlin.com,
	thomas.petazzoni@...tlin.com, seanpaul@...gle.com,
	nicolejadeyee@...gle.com, marcheu@...gle.com
Subject: Re: [PATCH v2 2/7] drm/vkms: Add support for multy-planar
 framebuffers

Hello,

I think there are some bugs in this implementation of multi-planar 
support (and not mylty-planar, there is a typo in the title), especially 
for the "new" drm_format_info description which uses block_w and block_h.

I will propose two patches [1] solving these issues and hopefully also 
simplifying a bit the composition.

TBH I'm not an expert, it's my first ever contribution in DRM, so please 
don't hesitate to correct me if you thin I missunderstood something, it 
actually took me a bit of time to fully understand the whole series and 
how it interacted with the rest of the vkms driver.

> -static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y)
> +static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y, size_t index)
>  {
>  	struct drm_framebuffer *fb = frame_info->fb;
>  
> -	return fb->offsets[0] + (y * fb->pitches[0])
> -			      + (x * fb->format->cpp[0]);
> +	return fb->offsets[index] + (y * fb->pitches[index])
> +				  + (x * fb->format->cpp[index]);
>  }
>  
>  /*
> @@ -23,27 +23,25 @@ static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int
>   * @frame_info: Buffer metadata
>   * @x: The x(width) coordinate of the 2D buffer
>   * @y: The y(Heigth) coordinate of the 2D buffer
> + * @index: The index of the plane on the 2D buffer
>   *
>   * Takes the information stored in the frame_info, a pair of coordinates, and
> - * returns the address of the first color channel.
> - * This function assumes the channels are packed together, i.e. a color channel
> - * comes immediately after another in the memory. And therefore, this function
> - * doesn't work for YUV with chroma subsampling (e.g. YUV420 and NV21).
> + * returns the address of the first color channel on the desired index.
>   */
>  static void *packed_pixels_addr(const struct vkms_frame_info *frame_info,
> -				int x, int y)
> +				int x, int y, size_t index)
>  {
> -	size_t offset = pixel_offset(frame_info, x, y);
> +	size_t offset = pixel_offset(frame_info, x, y, index);
>  
>  	return (u8 *)frame_info->map[0].vaddr + offset;
>  }

This implementation of packed_pixels_addr will only work with
block_w == block_h == 1. For packed or tiled formats we will need to use
x/y information to extract the correct address, and this address will not 
be a single pixel. See below my explanation.

[...]

> @@ -130,17 +128,28 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
>  {
>  	struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
>  	struct vkms_frame_info *frame_info = plane->frame_info;
> -	u8 *src_pixels = get_packed_src_addr(frame_info, y);
> +	const struct drm_format_info *frame_format = frame_info->fb->format;
>  	int limit = min_t(size_t, drm_rect_width(&frame_info->dst), stage_buffer->n_pixels);
> +	u8 *src_pixels[DRM_FORMAT_MAX_PLANES];
>  
> -	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->fb->format->cpp[0]) {
> +	for (size_t i = 0; i < frame_format->num_planes; i++)
> +		src_pixels[i] = get_packed_src_addr(frame_info, y, i);
> +
> +	for (size_t x = 0; x < limit; x++) {
>  		int x_pos = get_x_position(frame_info, limit, x);
>  
> -		if (drm_rotation_90_or_270(frame_info->rotation))
> -			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1)
> -				+ frame_info->fb->format->cpp[0] * y;
> +		if (drm_rotation_90_or_270(frame_info->rotation)) {
> +			for (size_t i = 0; i < frame_format->num_planes; i++) {
> +				src_pixels[i] = get_packed_src_addr(frame_info,
> +								    x + frame_info->rotated.y1, i);
> +				src_pixels[i] += frame_format->cpp[i] * y;

I find the current rotation management a bit complex to understand. This 
is not related to your patch, but as I had to understand this to create my 
second patch, I think this could be significanlty simplified.

Please see the below comment about frame_format->cpp, it applies here too. 
I think the "easy" way here is simply to reuse the method 
get_packed_src_addr every time you need a new pixel.

> +			}
> +		}
>  
>		plane->pixel_read(src_pixels, &out_pixels[x_pos]);
> +

The usage of cpp and pointer to specific pixel only work for non-packed 
and non-blocked pixels, but for example NV30 or Y0L0 need more 
informations about the exact location of the pixel to convert and write 
the correct pixel value (each pixel can't be referenced directly by a 
pointer). For example NV30 uses 5 bytes to store 3 pixels (10 bits each), 
so to access the "middle" one you need to read the 5 bytes and do a small 
computation to extract it's value.

I think a simple solution to handle most cases would be to profide two 
more parameters: the x and y positions of the pixel to copy, using 
"absolute coordinates" (i.e x=0,y=0 means the first byte of the src 
buffer, not the first pixel in the `drm_rect src`, this way the method 
`pixel_read` can extract the correct value).

This way it become easy to manage "complex" pixel representations in this 
loop: simply increment x/y and let the pixel_read method handle 
everything.

The second patch I will send is doing this. And as explained before, it 
will also simplify a lot the code related to rotation and translation (no 
more switch case everywhere to add offset to x/y, it simply use drm_rect_* 
helpers).

It's not optimal in term of performance (in some situation it will read 
the same block multiple time to generate different pixels), but I 
believe it still is an intersting trade-off.

In the future, if performance is actally critical, the whole composition 
loop will have to be specialized for each pixel formats: some can be 
treated line by line (as it's done today), but with blocks or packed 
pixels it's more complex.

> +		for (size_t i = 0; i < frame_format->num_planes; i++)
> +			src_pixels[i] += frame_format->cpp[i];

This is likely working with format with block_w != 1, see explanation 
above.

[...]

[1]: https://lore.kernel.org/dri-devel/20240201-yuv-v1-0-3ca376f27632@bootlin.com/T/#t

-- 
Louis Chauvet, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com