Message-ID: <Zqol_N8qkMI--n-S@valkosipuli.retiisi.eu>
Date: Wed, 31 Jul 2024 11:54:36 +0000
From: Sakari Ailus <sakari.ailus@....fi>
To: Ricardo Ribalda Delgado <ricardo.ribalda@...il.com>
Cc: Laurent Pinchart <laurent.pinchart@...asonboard.com>,
	Dan Williams <dan.j.williams@...el.com>,
	James Bottomley <James.Bottomley@...senpartnership.com>,
	ksummit@...ts.linux.dev, linux-cxl@...r.kernel.org,
	linux-rdma@...r.kernel.org, netdev@...r.kernel.org, jgg@...dia.com
Subject: Re: [MAINTAINERS SUMMIT] Device Passthrough Considered Harmful?

Hi Ricardo, Laurent,

On Mon, Jul 29, 2024 at 11:58:43AM +0200, Ricardo Ribalda Delgado wrote:
> On Sun, Jul 28, 2024 at 7:18 PM Laurent Pinchart
> <laurent.pinchart@...asonboard.com> wrote:
> >
> > On Fri, Jul 26, 2024 at 05:40:16PM +0200, Ricardo Ribalda Delgado wrote:
> > > On Fri, Jul 26, 2024 at 12:59 PM Laurent Pinchart wrote:
> > > > On Fri, Jul 26, 2024 at 10:04:33AM +0200, Ricardo Ribalda Delgado wrote:
> > > > > On Thu, Jul 25, 2024 at 10:02 PM Laurent Pinchart wrote:
> > > > > > On Mon, Jul 22, 2024 at 01:56:11PM +0200, Ricardo Ribalda Delgado wrote:
> > > > > > > On Mon, Jul 22, 2024 at 1:18 PM Laurent Pinchart wrote:
> > > > > > > > On Mon, Jul 22, 2024 at 12:42:52PM +0200, Ricardo Ribalda Delgado wrote:
> > > > > > > > > On Sun, Jul 21, 2024 at 9:25 PM Laurent Pinchart wrote:
> > > > > > > > > > On Tue, Jul 09, 2024 at 03:15:13PM -0700, Dan Williams wrote:
> > > > > > > > > > > James Bottomley wrote:
> > > > > > > > > > > > > The upstream discussion has yielded the full spectrum of positions on
> > > > > > > > > > > > > device specific functionality, and it is a topic that needs cross-
> > > > > > > > > > > > > kernel consensus as hardware increasingly spans cross-subsystem
> > > > > > > > > > > > > concerns. Please consider it for a Maintainers Summit discussion.
> > > > > > > > > > > >
> > > > > > > > > > > > I'm with Greg on this ... can you point to some of the contrary
> > > > > > > > > > > > positions?
> > > > > > > > > > >
> > > > > > > > > > > This thread has that discussion:
> > > > > > > > > > >
> > > > > > > > > > > http://lore.kernel.org/0-v1-9912f1a11620+2a-fwctl_jgg@nvidia.com
> > > > > > > > > > >
> > > > > > > > > > > I do not want to speak for others on the saliency of their points, all I
> > > > > > > > > > > can say is that the contrary positions have so far not moved me to drop
> > > > > > > > > > > consideration of fwctl for CXL.
> > > > > > > > > > >
> > > > > > > > > > > Where CXL has a Command Effects Log that is a reasonable protocol for
> > > > > > > > > > > making decisions about opaque command codes, and that CXL already has a
> > > > > > > > > > > few years of experience with the commands that *do* need a Linux-command
> > > > > > > > > > > wrapper.
> > > > > > > > > > >
> > > > > > > > > > > Some open questions from that thread are: what does it mean for the fate
> > > > > > > > > > > of a proposal if one subsystem Acks the ABI and another Naks it for a
> > > > > > > > > > > device that crosses subsystem functionality? Would a cynical hardware
> > > > > > > > > > > response just lead to plumbing an NVME admin queue, or CXL mailbox to
> > > > > > > > > > > get device-specific commands past another subsystem's objection?
> > > > > > > > > >
> > > > > > > > > > My default answer would be to trust the maintainers of the relevant
> > > > > > > > > > subsystems (or try to convince them when you disagree :-)). Not only
> > > > > > > > > > should they know the technical implications best, they should also have
> > > > > > > > > > a good view of the whole vertical stack, and the implications of
> > > > > > > > > > pass-through for their ecosystem. This may result in a single NAK
> > > > > > > > > > overriding ACKs, but we could also try to find technical solutions when
> > > > > > > > > > we'll face such issues, to enforce different sets of rules for the
> > > > > > > > > > different functions of a device.
> > > > > > > > > >
> > > > > > > > > > Subsystem hopping is something we're recently noticed for camera ISPs,
> > > > > > > > > > where a vendor wanted to move from V4L2 to DRM. Technical reasons for
> > > > > > > > > > doing so were given, and they were (in my opinion) rather excuses. The
> > > > > > > > > > unspoken real (again in my opinion) reason was to avoid documenting the
> > > > > > > > > > firmware interface and ship userspace binary blobs with no way for free
> > > > > > > > > > software to use all the device's features. That's something we have been
> > > > > > > > > > fighting against for years, trying to convince vendors that they can
> > > > > > > > > > provide better and more open camera support without the world
> > > > > > > > > > collapsing, with increasing success recently. Saying amen to
> > > > > > > > > > pass-through in this case would be a huge step back that would hurt
> > > > > > > > > > users and the whole ecosystem in the short and long term.
> > > > > > > > >
> > > > > > > > > In my view, DRM is a more suitable model for complex ISPs than V4L2:
> > > > > > > >
> > > > > > > > I know we disagree on this topic :-) I'm sure we'll continue the
> > > > > > > > conversation, but I think the technical discussion likely belongs to a
> > > > > > > > different mail thread.
> > > > > > > >
> > > > > > > > > - Userspace Complexity: ISPs demand a highly complex and evolving API,
> > > > > > > > > similar to Vulkan or OpenGL. Applications typically need a framework
> > > > > > > > > like libcamera to utilize ISPs effectively, much like Mesa for
> > > > > > > > > graphics cards.
> > > > > > > > >
> > > > > > > > > - Lack of Standardization: There's no universal standard for ISPs;
> > > > > > > > > each vendor implements unique features and usage patterns. DRM
> > > > > > > > > addresses this through vendor-specific IOCTLs
> > > > > > > > >
> > > > > > > > > - Proprietary Architectures: Vendors often don't fully disclose their
> > > > > > > > > hardware architectures. DRM cleverly only necessitates a Mesa
> > > > > > > > > implementation, not comprehensive documentation.
> > > > > > > >
> > > > > > > > This point isn't technical and is more on-topic for this mail thread.
> > > > > > > >
> > > > > > > > V4L2 doesn't require hundreds of pages of comprehensive documentation in
> > > > > > > > text form. An open-source userspace implementation that covers the
> > > > > > > > feature set exposed by the driver is acceptable in place of
> > > > > > > > documentation (provided, of course, that the userspace code wouldn't be
> > > > > > > > deliberately obfuscated). This is similar in spirit to the rule for GPU
> > > > > > > > DRM drivers.
> > > > > > >
> > > > > > > In DRM vendors typically define a custom IOCTL per driver to pass
> > > > > > > command buffers.
> > > > > > > Only the command buffer structure, and a mesa implementation using
> > > > > > > that command buffer to support the standard features is required.
> > > > > > >
> > > > > > > In V4l2 custom IOCTLs are discouraged. Random command buffers cannot
> > > > > > > be passed from userspace, they are typically formed in the driver from
> > > > > > > a strictly checked struct.
> > > > > >
> > > > > > V4L2 has a mechanism to pass buffers between userspace and kernelspace,
> > > > > > and that mechanism is used in mainline drivers to pass camera ISP
> > > > > > parameters. They're not called "command buffers" but that's just a
> > > > > > difference in terminology. The technical means to pass command buffers
> > > > > > to the driver is thus there, I see no meaningful difference with DRM.
> > > > > > Where things can differ is in the contents of those buffers, and the
> > > > > > requirements for documentation or open userspace implementations, but
> > > > > > that's not a technical question.
> > > > >
> > > > > There are two things here:
> > > > >
> > > > > - The political/strategic/philosophical/religious aspect: The industry
> > > > > definitely prefers the strategic requirements imposed by DRM. In fact
> > > > > some vendors had some huge legal troubles when they had tried to
> > > > > follow v4l2 requirements.
> > > >
> > > > That's I'm willing to debate.
> > > >
> > > > > - The technical aspect: DRM is more mature when it comes to
> > > > > sending/receiving buffers to the hardware, and an ISP looks *much*
> > > > > more similar to an accel device or a GPU than a UVC camera.
> > > >
> > > > But this I don't agree with. I think we should forgo the technical
> > > > discussion and stop pretending that DRM is better for this use case from
> > > > a technical point of view, and focus on the other aspect of the
> > > > discussion. (We can of course reopen the technical discussion if new
> > > > concrete arguments emerge.)
> > >
> > > I disagree. ISP devices are EXACTLY the same as accel devices.

...

> Most of the time, Camera ISPs are nothing more than DSPs plus a
> firmware with the computer vision algos.
> 
> If there is no framework to use a programmable ISP, vendors will keep
> putting everything on the firmware and exposing parameters.

ISPs, as the name suggests, tend to be purpose-built devices, even if some
support some degree of programmability. Even in those cases, the
programmable hardware often assumes pixels are being processed so it
can't be meaningfully used for general-purpose computation. The majority of
devices still have a hardware pipeline with no programmability.

This is also very different from GPUs or accel devices that are built to be
user-programmable. If I were to compare ISPs to other devices, then the
closest match would probably be video codecs -- which also use V4L2.

Also, many ISPs use sensor input directly. That introduces timing
constraints that have implications on the API, too, as well as the use of
V4L2/MC APIs by those devices today. I'd rather not see devices using an
entirely different API such as DRM in the same pipeline.

I believe we agree V4L2 isn't a great API for camera ISPs but then again I
don't think just switching to DRM is a solution that reasonably covers even
much of the problem area. Some ISPs could be fine using it as such from a
purely technical point of view but there are those for which DRM isn't
an option at all (mainly those having a sensor input).

So in my opinion we either need to improve V4L2/MC to better support those
devices or have an entirely new UAPI for them.

-- 
Kind regards,

Sakari Ailus
