Message-ID: <CAAFQd5BaEt0d_Xury-taUbmzU++5La8y65+=Zie-QFNyEY9BEg@mail.gmail.com>
Date: Wed, 24 Dec 2025 15:09:04 +0900
From: Tomasz Figa <tfiga@...omium.org>
To: Nicolas Dufresne <nicolas@...fresne.ca>
Cc: Hans Verkuil <hverkuil+cisco@...nel.org>, Hirokazu Honda <hiroh@...omium.org>,
Dmitry Osipenko <dmitry.osipenko@...labora.com>, Mauro Carvalho Chehab <mchehab@...nel.org>,
Benjamin Gaignard <benjamin.gaignard@...labora.com>,
Daniel Almeida <daniel.almeida@...labora.com>, linux-media@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1] media: videobuf2: Allow applications customize data
offsets of capture buffers
On Sat, Dec 20, 2025 at 12:18 AM Nicolas Dufresne <nicolas@...fresne.ca> wrote:
>
> Hi Hans,
>
> On Wednesday, 17 December 2025 at 11:02 +0100, Hans Verkuil wrote:
> > > For me, the most central issue in V4L2 is that the memory allocation/importation
> > > is bound to the operation queues. That brings all sorts of issues, such as:
> > >
> > > - We can't queue the same frame twice
I think one could just call QBUF with the same buffer index twice (or more).
> > > - We can't mix external buffers with device-allocated buffers
VIDIOC_CREATE_BUFS accepts a memory type, so one could use it to
create ranges of indexes for different buffer types.
> > > - All buffers must have the exact same stride
This one is less straightforward because V4L2 currently specifies
stride as part of the format, not the buffer. In general, buffers in
vb2 are just memory planes, and only specific video operations
interpret them as a particular format.
So, this is where we might need a brand new UAPI. Although one could
also argue that VIDIOC_S_FMT should only apply to buffers that are
about to be queued, so one can just call it before queuing a buffer
that has a different stride than previous buffers (worst-case
scenario: before every VIDIOC_QBUF).
> >
> > The three limitations above are all technically possible to implement with the
> > current vb2 framework/streaming uAPI, it's just that nobody was ever motivated
> > enough to add support for it.
>
> I don't see how it is technically possible without new uAPI to support
> heterogeneous strides, to queue the same frame twice while running in MMAP,
> or to mix device memory and externally imported memory. Please feel free to
> enlighten me if you have some spare time.
>
> Perhaps worth mentioning that this is about doing this without creating glitches
> or jumps caused by expensive drain and streamoff/on sequences.
Some random ideas above ^ :)
>
> >
> > > - Application is responsible for caching which memory goes to which v4l2_buffer
> >
> > True, but is this really a big deal?
>
> Maybe a lesser deal, but it's extra complexity for both sides. The current bug
> is that if you use import mode only to work around the "queuing twice" issue, you
> will end up with two mappings (which some stateful codec firmware does not
> allow). So on top of the lookup userspace does to match
> buffer ids to their memory (it has to cache pointers, fds and all), the driver
> (or vb2) should also implement caching. My proposal implies solving that dual
> mapping issue for both the current and future stream mode.
>
> In userspace, there are also cases where the video buffers come from another
> process, and you don't really know if two FD values point to the same dmabuf.
> This is the kind of scenario the DRM subsystem had to deal with in compositors; in
> our case that would be something such as pipewire. This caching is either a micro
> optimization or simply there to support a firmware limitation, but a guarantee of
> having 1 memory object for one chunk is in my opinion achievable and allows reducing
> complexity.
This is actually a big deal because, with DMA-bufs, the video code
often only receives file descriptors and no unique buffer identifiers.
Consequently, the video code has no reliable way to determine which
physical buffer a received file descriptor maps to.
And of course, if the application doesn't match the buffers with the
indexes, the vb2 mappings get thrashed, at a significant performance cost.
By contrast, in the DRM UAPI the import (PRIME fd to handle)
operation always returns the same buffer handle if a buffer is
imported more than once. That handle uniquely identifies the physical
buffer and can be used for this kind of mapping optimization.
Alternatively, we could fix this in the kernel by separating vb2
mappings from buffer indexes and associating them with DMA-buf
attachments instead, but the last time I spent some time evaluating that,
it led to some complex lifetime management issues, e.g. when to
destroy such a mapping.
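To illustrate the DRM-style import semantics mentioned above, here is a small user-space model of an import cache that hands back the same handle whenever the same underlying object is imported again. This is only a sketch, not the DRM or vb2 implementation: struct import_cache and import_fd() are made-up names, and the sketch assumes that the (st_dev, st_ino) pair reported by fstat() identifies the underlying buffer, which is an assumption of this example rather than something established in this thread. It also deliberately ignores the lifetime question (when to drop an entry).

```c
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

#define MAX_IMPORTS 16

/* One cached import: identity of the underlying object plus its handle. */
struct import_entry {
    dev_t dev;
    ino_t ino;
    unsigned int handle;
};

/* Hypothetical cache; zero-initialize before first use. */
struct import_cache {
    struct import_entry entries[MAX_IMPORTS];
    size_t count;
    unsigned int next_handle;
};

/*
 * Import a file descriptor. Returns a non-zero handle that is stable for
 * the same underlying object, or 0 on error (bad fd or cache full).
 * ASSUMPTION: (st_dev, st_ino) is treated as the buffer identity.
 */
static unsigned int import_fd(struct import_cache *cache, int fd)
{
    struct stat st;
    size_t i;

    if (fstat(fd, &st) != 0)
        return 0;

    /* Already imported: return the existing handle. */
    for (i = 0; i < cache->count; i++) {
        if (cache->entries[i].dev == st.st_dev &&
            cache->entries[i].ino == st.st_ino)
            return cache->entries[i].handle;
    }

    if (cache->count == MAX_IMPORTS)
        return 0;

    /* First import of this object: allocate a new handle. */
    cache->entries[cache->count].dev = st.st_dev;
    cache->entries[cache->count].ino = st.st_ino;
    cache->entries[cache->count].handle = ++cache->next_handle;
    return cache->entries[cache->count++].handle;
}
```

With something like this, two different fd values that refer to the same object (for example one obtained via dup() or received from another process) resolve to one handle, which can then be used as the key for a single mapping.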
Best regards,
Tomasz