lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPybu_28xdSSz43wd+yY25_0SbGUBPOpgbKhknUoibbNiQBQfg@mail.gmail.com>
Date: Mon, 29 Jul 2024 11:57:25 +0200
From: Ricardo Ribalda Delgado <ricardo.ribalda@...il.com>
To: Laurent Pinchart <laurent.pinchart@...asonboard.com>
Cc: Jason Gunthorpe <jgg@...dia.com>, James Bottomley <James.Bottomley@...senpartnership.com>, 
	Jiri Kosina <jikos@...nel.org>, Dan Williams <dan.j.williams@...el.com>, ksummit@...ts.linux.dev, 
	linux-cxl@...r.kernel.org, linux-rdma@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [MAINTAINERS SUMMIT] Device Passthrough Considered Harmful?

On Sun, Jul 28, 2024 at 5:25 PM Laurent Pinchart
<laurent.pinchart@...asonboard.com> wrote:
>
> On Fri, Jul 26, 2024 at 05:43:21PM +0200, Ricardo Ribalda Delgado wrote:
> > On Fri, Jul 26, 2024 at 4:22 PM Laurent Pinchart wrote:
> > > On Fri, Jul 26, 2024 at 10:11:06AM -0300, Jason Gunthorpe wrote:
> > > > On Fri, Jul 26, 2024 at 03:49:49PM +0300, Laurent Pinchart wrote:
> > > >
> > > > > What is not an option exactly in my description above ? We have multiple
> > > > > V4L2 drivers for ISPs. They receive ISP parameters from userspace
> > > > > through a data buffer. It's not allowed to be opaque, but it doesn't
> > > > > prevent vendor closed-source userspace implementations with additional
> > > > > *camera* features, as long as the *hardware* features are available to
> > > > > everybody.
> > > >
> > > > How far do you take opaque?
> > > >
> > > > In mlx5 we pass command buffers from user to kernel to HW and the
> > > > kernel does only a little checking.
> > >
> > > I won't comment on the mlx5 case in particular, I don't have enough
> > > knowledge to do so.
> > >
> > > When it comes to validation of commands by the kernel, at the very least
> > > the kernel driver needs to be able to guarantee safety. For instance,
> > > that means that any command that can result in DMA operations need to be
> > > validated (e.g. verifying buffer addresses and sizes) and/or executed
> > > partly by the driver (e.g. mapping a buffer through an IOMMU), depending
> > > on hardware constraints.
> > >
> > > > There is a 12kloc file describing the layout of alot of commands:
> > > > include/linux/mlx5/mlx5_ifc.h
> > > >
> > > > There is an open PDF describing in detail some subset of this:
> > > > https://network.nvidia.com/files/doc-2020/ethernet-adapters-programming-manual.pdf
> > > >
> > > > There are in-kernel implementations driving most of those commands.
> > >
> > > For camera ISPs, most of the "commands" wouldn't be driven by the
> > > kernel. I don't have expectations when it comes to what commands the
> > > kernel exercises directly, I think that highly depends on the device
> > > type.
> > >
> > > > Other commands are only issued by userspace, and we have open source
> > > > DPDK, rdma-core and UCX implementations driving them.
> > > >
> > > > ie, this is really quite good as far as a device providing open source
> > > > solutions goes.
> > > >
> > > > However, no doubt there is more FW capability and commands than even
> > > > this vast amount documents - so lets guess that propritary code is
> > > > using this interface with unknown commands too.
> > > >
> > > > From a camera perspective would you be OK with that? Let's guess that
> > > > 90% of use cases are covered with fully open code. Do we also need to
> > > > forcefully close the door to an imagined 10% of proprietary cases just
> > > > to be sure open always wins?
> > >
> > > For command buffers interpreted by a firmware, it would be extremely
> > > difficult, if not impossible, to ensure that nothing undocumented is
> > > possible to access from userspace. I think we have two issues here, one
> > > of them is to define what we required to be openly accessible, and the
> > > other to avoid vendors cheating by claiming to be more open than they
> > > actually are.
> > >
> > > I believe the latter can't be solved technically. At the end of the day
> > > it's a matter of trust, and the only option I can see is to build that
> > > trust with the vendors, and to make it clear that breaches of trust will
> > > have consequences. A vendor that deliberately lies would end up on my
> > > personal backlist for as long as they don't implement structural
> > > solutions to be a better player in the ecosystem. What is acceptable as
> > > consequences is for the communities to decide. We have previously banned
> > > people or organizations from contributing to the kernel for certain
> > > periods of time (the University of Minnesota case comes to my mind for
> > > instance), and other options can be considered too.
> > >
> > > As for the first issue, I think it's a difficult one as it is very
> > > difficult to quantify the features covered by open implementations, as
> > > well as their importance. You could have a 99% open command set where
> > > the 1% would impact open implementations extremely negatively, the same
> > > way you could have a 50% open command set where the other 50% isn't of
> > > any use to anyone but the vendor (for instance used to debug the
> > > firmware implementation).
> >
> > It is not that difficult. You just have to define what an acceptable
> > implementation is. Eg
> >
> > - Sharpness at specific light.
> > - Time of convergence for the Auto algorithms
> > - Noise ratio
>
> Those may just be bad examples, but I think they showcase how little
> these discussions are based on technical expertise and facts. Just

Could you perhaps provide better examples of what are good objective
characteristics of an open implementation. ?

Those characteristics have been directly extracted from actual OS
certification requirements for Camera stacks.

> singling out one example, the convergence time is driven by the
> userspace implementation of the ISP control algorithms. That is *not*
> something we want to force vendors to disclose. It isn't related to the
> ISP parameters and the UAPI the driver exposes to userspace.

I am not talking about the driver UAPI. I am saying that the Open
implementation provided by vendors to land their code must not be a
"toy" implementation, but have good characteristics.

You said yourself that with the "new rules" we also demand a reference
open ISP control algorithm.

Would you be OK with an open algorithm that cannot converge in a
reasonable time? I hope not.


>
> > We could even use the SoftISP implementation as reference. Is this ISP
> > working better or worse than SoftISP?
> >
> > > For cameras, the feature set can I believe be expressed in terms of ISP
> > > processing blocks that are usable by open implementations, as well as in
> > > terms of the type of usage of those features. For instance, is it
> > > acceptable that a vendor documents how to use 2D noise reduction but
> > > makes 3D (temporal) noise reduction available only to close-source
> > > userspace ? My personal answer is no. I want to aim for very close to
> > > 100% of the features, and certainly 100% of the "large" features. I can
> > > close my eyes on features that are very very niche, but what is niche
> > > today may not be tomorrow. For instance, if a camera ISP implements a
> > > feature used only for very specific camera sensors targetted at
> > > autonomous driving in a new generation of research prototypes that only
> > > one company has access to, I'll be inclined to ignore that. That
> > > technology may however become much more mainstream 5 years later.
> >
> > We can update our requirements in 5 years. Nothing is written in stone.
>
> That at least we agree on :-)
>
> > Also it is much easier to reverse engineer an open driver, with an
> > open userspace and a closed userspace than a close driver with a
> > closed userspace.
>
> Closed kernel drivers are a GPL violation. Do you have any example, or
> do you mean out-of-tree drivers ? I don't think out-of-tree vs. mainline

I am not a lawyer, so I am not capable of discussing violations of any License.

I mean close and out-of-tree. Just two name two well known drivers
with a binary blob:
- fgrlx
- nvidia proprietary.

> drivers would make much difference when it comes to reverse engineering,
> if the mainline driver is just a pass-through driver.

If you have an usable open source camera stack, reverse engineering
the userspace closed implementation should not be a big challenge.

>
> > > The other aspect is the type of usage of the features. For camera ISPs
> > > again, some parameters will be computed through a tuning process, and
> > > will be fixed for the lifetime of the camera. I want those parameters to
> > > be documented, to enable users to tune cameras for their own use cases.
> > > This is less important in some market segments, such as laptops for
> > > instance, but is crucial for embedded devices. This is an area where
> > > I've previously been called unreasonable, and I don't think that's fair.
> > > The tuned-and-immutable parameters are not plentiful, as most of the
> > > tuning results in data that needs to be combined with runtime
> > > information to compute ISP parameters, so the "this is for tuning only"
> > > argument doesn't hold as much as one may think. For real immutable
> > > parameters, a large number of them are related to image processing steps
> > > that are very common and found in most ISPs, such as lens shading
> > > correction or geometric distorsion correction. I don't see why we should
> > > let vendors keep those closed.
> >
> > We don't have enough leverage for that kind of requirement.
>
> I very much disagree with that.
>
> > To be fair, we do not ask touchscreen manufacturers to document their
> > calibration files. Nor any other calibration file in the kernel.
> >
> > The calibration file for me is like a firmware blob.
>
> It may be for you, but I don't think it is an accurate description for
> the rest of the industry.

Can you define who the industry is here?

In my job I have visibility on who and how they create that
calibration file, and the release cycle is identical to the firmware.

>
> > > I'm sorry that this discussion is turning into something very
> > > camera-centric, but that's the relevant area I know best. I hope that
> > > discussing problems related to device pass-through in different areas in
> > > the open will help build a wider shared understanding of the problems,
> > > and hopefully help designing solutions by better taking into account the
> > > various aspects of the issues.
> > >
> > > > Does closing the door have to come at the cost of a technically clean
> > > > solution? Doing validation in the kernel to enforce an ideological
> > > > position would severely degrade mlx5, and would probably never really
> > > > be robust.
>
> --
> Regards,
>
> Laurent Pinchart



-- 
Ricardo Ribalda

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ