linux-kernel - Re: [RFC] Stateless codecs: how to refer to reference frames

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bb08f06c-e97a-7bde-bf44-0675c4e626b9@xs4all.nl>
Date:   Wed, 24 Oct 2018 12:03:01 +0100
From:   Hans Verkuil <hverkuil@...all.nl>
To:     Alexandre Courbot <acourbot@...omium.org>,
        Tomasz Figa <tfiga@...omium.org>,
        Paul Kocialkowski <paul.kocialkowski@...tlin.com>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        Pawel Osciak <posciak@...omium.org>,
        linux-media@...r.kernel.org
Cc:     linux-kernel@...r.kernel.org,
        Maxime Ripard <maxime.ripard@...e-electrons.com>,
        Ezequiel Garcia <ezequiel@...labora.com>
Subject: Re: [RFC] Stateless codecs: how to refer to reference frames

On 10/19/2018 10:40 AM, Hans Verkuil wrote:
> From Alexandre's '[RFC PATCH v3] media: docs-rst: Document m2m stateless
> video decoder interface':
> 
> On 10/19/18 10:09, Alexandre Courbot wrote:
>> Two points being currently discussed have not been changed in this
>> revision due to lack of better idea. Of course this is open to change:
> 
> <snip>
> 
>> * The other hot topic is the use of capture buffer indexes in order to
>>   reference frames. I understand the concerns, but I doesn't seem like
>>   we have come with a better proposal so far - and since capture buffers
>>   are essentially well, frames, using their buffer index to directly
>>   reference them doesn't sound too inappropriate to me. There is also
>>   the restriction that drivers must return capture buffers in queue
>>   order. Do we have any concrete example where this scenario would not
>>   work?
> 
> I'll stick to decoders in describing the issue. Stateless encoders probably
> do not have this issue.
> 
> To recap: the application provides a buffer with compressed data to the
> decoder. After the request is finished the application can dequeue the
> decompressed frame from the capture queue.
> 
> In order to decompress the decoder needs to access previously decoded
> reference frames. The request passed to the decoder contained state
> information containing the buffer index (or indices) of capture buffers
> that contain the reference frame(s).
> 
> This approach puts restrictions on the framework and the application:
> 
> 1) It assumes that the application can predict the capture indices.
> This works as long as there is a simple relationship between the
> buffer passed to the decoder and the buffer you get back.
> 
> But that may not be true for future codecs. And what if one buffer
> produces multiple capture buffers? (E.g. if you want to get back
> decompressed slices instead of full frames to reduce output latency).
> 
> This API should be designed to be future-proof (within reason of course),
> and I am not at all convinced that future codecs will be just as easy
> to predict.
> 
> 2) It assumes that neither drivers nor applications mess with the buffers.
> One case that might happen today is if the DMA fails and a buffer is
> returned marked ERROR and the DMA is retried with the next buffer. There
> is nothing in the spec that prevents you from doing that, but it will mess
> up the capture index numbering. And does the application always know in
> what order capture buffers are queued? Perhaps there are two threads: one
> queueing buffers with compressed data, and the other dequeueing the
> decompressed buffers, and they are running mostly independently.
> 
> 
> I believe that assuming that you can always predict the indices of the
> capture queue is dangerous and asking for problems in the future.
> 
> 
> I am very much in favor of using a dedicated cookie. The application sets
> it for the compressed buffer and the driver copies it to the uncompressed
> capture buffer. It keeps track of the association between capture index
> and cookie. If a compressed buffer decompresses into multiple capture
> buffers, then they will all be associated with the same cookie, so
> that simplifies how you refer to reference frames if they are split
> over multiple buffers.
> 
> The codec controls refer to reference frames by cookie(s).
> 
> For existing applications that use the capture index all you need to do
> is to set the capture index as the cookie value in the output buffer.
> 
> It is my understanding that ChromeOS was using the timestamp as the
> cookie value.
> 
> I have thought about that, but I am not in favor of doing that. One
> reason is that struct timeval comes in various flavors (32 bit, 64 bit,
> and a y2038-compatible 32-bit type in the future).
> 
> The y2038 compat code that we will need concerns me in particular since
> it will mean that the timeval is converted from 32 to 64 bit and back
> again, and that might well mangle the data. I'm not so sure if you can
> stick a 64-bit pointer in the timeval (e.g. the high 32 bits go to into
> the tv_sec field, the low 32 bits go to the usecs). The y2038 conversion
> might mangle the tv_usec field (e.g. divide by 1000000 and add the seconds
> to the tv_sec field).
> 
> I would really prefer an independent 64-bit cookie value that the application
> can set instead of abusing something else.
> 
> I propose to make a union with v4l2_timecode (which nobody uses) and a
> new V4L2_BUF_FLAG_COOKIE flag.
> 
> struct v4l2_buffer_cookie {
> 	__u32 high;
> 	__u32 low;
> };
> 
> And in v4l2_buffer:
> 
> 	union {
> 		struct v4l2_timecode timecode;
> 		struct v4l2_buffer_cookie cookie;
> 	};
> 
> And static inlines:
> 
> void v4l2_buffer_set_cookie(struct v4l2_buffer *buf, __u64 cookie)
> {
> 	buf->cookie.high = cookie >> 32;
> 	buf->cookie.low = cookie & 0xffffffffULL;
> 	buf->flags |= V4L2_BUF_FLAG_COOKIE;
> }
> 
> void v4l2_buffer_set_cookie_ptr(struct v4l2_buffer *buf, void *cookie)
> {
> 	v4l2_buffer_set_cookie(buf, (__u64)cookie);
> }
> 
> __u64 v4l2_buffer_get_cookie(struct v4l2_buffer *buf)
> {
> 	if (!(buf->flags & V4L2_BUF_FLAG_COOKIE))
> 		return 0;
> 	return (((__u64)buf->cookie.high) << 32) | (__u64)buf->cookie.low;
> }
> 
> void *v4l2_buffer_get_cookie_ptr(struct v4l2_buffer *buf)
> {
> 	return (void *)v4l2_buffer_get_cookie(buf);
> }
> 
> Why not just use __u64? Because the alignment in v4l2_buffer is a nightmare.
> Using __u64 would create holes, made even worse by different struct timeval
> sizes depending on the architecture.
> 
> I'm proposing a struct v4l2_ext_buffer together with new streaming ioctls
> during the media summit that has a clean layout and there this can be just
> a __u64.
> 
> I'm calling it a 'cookie' here, but that's just a suggestion. Better
> names are welcome.
> 
> Comments are welcome.
> 
> Regards,
> 
> 	Hans
> 

I added cookie support to vb2 in this branch:

https://git.linuxtv.org/hverkuil/media_tree.git/log/?h=cookie

It's compile tested only.

The core is a new vb2_find_cookie() function that returns the index of the first buffer that matches
the cookie, or -1 if nothing was found.

It only checks buffers that have a cookie and that are in the DEQUEUED or DONE state.

I'm not wedded to the 'int' return code, it could also return a vb2_buffer pointer.

It also returns a single buffer at the moment, so it will have to be changed to support
cases where there are multiple buffers that all belong to the same reference frame
(i.e. slices). But we don't support that anyway at the moment. I'm considering adding
a 'start' index argument so it can be called multiple times to find all buffers with
the same cookie.

No documentation yet, and as mentioned it is only compiled tested.

Regards,

	Hans