[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPybu_3+J-e-s2+3KV51Nc4Z1rys98xnN6p_0xXkdag-no09JQ@mail.gmail.com>
Date: Fri, 26 Jul 2024 17:43:21 +0200
From: Ricardo Ribalda Delgado <ricardo.ribalda@...il.com>
To: Laurent Pinchart <laurent.pinchart@...asonboard.com>
Cc: Jason Gunthorpe <jgg@...dia.com>, James Bottomley <James.Bottomley@...senpartnership.com>,
Jiri Kosina <jikos@...nel.org>, Dan Williams <dan.j.williams@...el.com>, ksummit@...ts.linux.dev,
linux-cxl@...r.kernel.org, linux-rdma@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [MAINTAINERS SUMMIT] Device Passthrough Considered Harmful?
On Fri, Jul 26, 2024 at 4:22 PM Laurent Pinchart
<laurent.pinchart@...asonboard.com> wrote:
>
> On Fri, Jul 26, 2024 at 10:11:06AM -0300, Jason Gunthorpe wrote:
> > On Fri, Jul 26, 2024 at 03:49:49PM +0300, Laurent Pinchart wrote:
> >
> > > What is not an option exactly in my description above ? We have multiple
> > > V4L2 drivers for ISPs. They receive ISP parameters from userspace
> > > through a data buffer. It's not allowed to be opaque, but it doesn't
> > > prevent vendor closed-source userspace implementations with additional
> > > *camera* features, as long as the *hardware* features are available to
> > > everybody.
> >
> > How far do you take opaque?
> >
> > In mlx5 we pass command buffers from user to kernel to HW and the
> > kernel does only a little checking.
>
> I won't comment on the mlx5 case in particular, I don't have enough
> knowledge to do so.
>
> When it comes to validation of commands by the kernel, at the very least
> the kernel driver needs to be able to guarantee safety. For instance,
> that means that any command that can result in DMA operations need to be
> validated (e.g. verifying buffer addresses and sizes) and/or executed
> partly by the driver (e.g. mapping a buffer through an IOMMU), depending
> on hardware constraints.
>
> > There is a 12kloc file describing the layout of alot of commands:
> > include/linux/mlx5/mlx5_ifc.h
> >
> > There is an open PDF describing in detail some subset of this:
> > https://network.nvidia.com/files/doc-2020/ethernet-adapters-programming-manual.pdf
> >
> > There are in-kernel implementations driving most of those commands.
>
> For camera ISPs, most of the "commands" wouldn't be driven by the
> kernel. I don't have expectations when it comes to what commands the
> kernel exercises directly, I think that highly depends on the device
> type.
>
> > Other commands are only issued by userspace, and we have open source
> > DPDK, rdma-core and UCX implementations driving them.
> >
> > ie, this is really quite good as far as a device providing open source
> > solutions goes.
> >
> > However, no doubt there is more FW capability and commands than even
> > this vast amount documents - so lets guess that propritary code is
> > using this interface with unknown commands too.
> >
> > From a camera perspective would you be OK with that? Let's guess that
> > 90% of use cases are covered with fully open code. Do we also need to
> > forcefully close the door to an imagined 10% of proprietary cases just
> > to be sure open always wins?
>
> For command buffers interpreted by a firmware, it would be extremely
> difficult, if not impossible, to ensure that nothing undocumented is
> possible to access from userspace. I think we have two issues here, one
> of them is to define what we required to be openly accessible, and the
> other to avoid vendors cheating by claiming to be more open than they
> actually are.
>
> I believe the latter can't be solved technically. At the end of the day
> it's a matter of trust, and the only option I can see is to build that
> trust with the vendors, and to make it clear that breaches of trust will
> have consequences. A vendor that deliberately lies would end up on my
> personal backlist for as long as they don't implement structural
> solutions to be a better player in the ecosystem. What is acceptable as
> consequences is for the communities to decide. We have previously banned
> people or organizations from contributing to the kernel for certain
> periods of time (the University of Minnesota case comes to my mind for
> instance), and other options can be considered too.
>
> As for the first issue, I think it's a difficult one as it is very
> difficult to quantify the features covered by open implementations, as
> well as their importance. You could have a 99% open command set where
> the 1% would impact open implementations extremely negatively, the same
> way you could have a 50% open command set where the other 50% isn't of
> any use to anyone but the vendor (for instance used to debug the
> firmware implementation).
It is not that difficult. You just have to define what an acceptable
implementation is. Eg
- Sharpness at specific light.
- Time of convergence for the Auto algorithms
- Noise ratio
We could even use the SoftISP implementation as reference. Is this ISP
working better or worse than SoftISP?
>
> For cameras, the feature set can I believe be expressed in terms of ISP
> processing blocks that are usable by open implementations, as well as in
> terms of the type of usage of those features. For instance, is it
> acceptable that a vendor documents how to use 2D noise reduction but
> makes 3D (temporal) noise reduction available only to close-source
> userspace ? My personal answer is no. I want to aim for very close to
> 100% of the features, and certainly 100% of the "large" features. I can
> close my eyes on features that are very very niche, but what is niche
> today may not be tomorrow. For instance, if a camera ISP implements a
> feature used only for very specific camera sensors targetted at
> autonomous driving in a new generation of research prototypes that only
> one company has access to, I'll be inclined to ignore that. That
> technology may however become much more mainstream 5 years later.
We can update our requirements in 5 years. Nothing is written in stone.
Also it is much easier to reverse engineer an open driver, with an
open userspace and a closed userspace than a close driver with a
closed userspace.
>
> The other aspect is the type of usage of the features. For camera ISPs
> again, some parameters will be computed through a tuning process, and
> will be fixed for the lifetime of the camera. I want those parameters to
> be documented, to enable users to tune cameras for their own use cases.
> This is less important in some market segments, such as laptops for
> instance, but is crucial for embedded devices. This is an area where
> I've previously been called unreasonable, and I don't think that's fair.
> The tuned-and-immutable parameters are not plentiful, as most of the
> tuning results in data that needs to be combined with runtime
> information to compute ISP parameters, so the "this is for tuning only"
> argument doesn't hold as much as one may think. For real immutable
> parameters, a large number of them are related to image processing steps
> that are very common and found in most ISPs, such as lens shading
> correction or geometric distorsion correction. I don't see why we should
> let vendors keep those closed.
We don't have enough leverage for that kind of requirement.
To be fair, we do not ask touchscreen manufacturers to document their
calibration files. Nor any other calibration file in the kernel.
The calibration file for me is like a firmware blob.
>
> I'm sorry that this discussion is turning into something very
> camera-centric, but that's the relevant area I know best. I hope that
> discussing problems related to device pass-through in different areas in
> the open will help build a wider shared understanding of the problems,
> and hopefully help designing solutions by better taking into account the
> various aspects of the issues.
>
> > Does closing the door have to come at the cost of a technically clean
> > solution? Doing validation in the kernel to enforce an ideological
> > position would severely degrade mlx5, and would probably never really
> > be robust.
>
> --
> Regards,
>
> Laurent Pinchart
--
Ricardo Ribalda
Powered by blists - more mailing lists