Message-ID: <aKXnzqmz-_eR_bHF@google.com>
Date: Wed, 20 Aug 2025 15:20:46 +0000
From: Mostafa Saleh <smostafa@...gle.com>
To: Eric Auger <eric.auger@...hat.com>
Cc: Alex Williamson <alex.williamson@...hat.com>, kvm@...r.kernel.org,
	linux-kernel@...r.kernel.org, clg@...hat.com
Subject: Re: [PATCH 2/2] vfio/platform: Mark for removal

Hi Eric,

On Tue, Aug 19, 2025 at 11:58:32AM +0200, Eric Auger wrote:
> Hi Mostafa,
> 
> On 8/18/25 7:33 PM, Mostafa Saleh wrote:
> > On Mon, Aug 18, 2025 at 10:52:42AM -0600, Alex Williamson wrote:
> >> On Fri, 15 Aug 2025 16:59:37 +0000
> >> Mostafa Saleh <smostafa@...gle.com> wrote:
> >>
> >>> Hi Alex,
> >>>
> >>> On Wed, Aug 06, 2025 at 11:03:12AM -0600, Alex Williamson wrote:
> >>>> vfio-platform hasn't had a meaningful contribution in years.  In-tree
> >>>> hardware support is predominantly only for devices which are long since
> >>>> e-waste.  QEMU support for platform devices is slated for removal in
> >>>> QEMU-10.2.  Eric Auger presented on the future of the vfio-platform
> >>>> driver and difficulties supporting new devices at KVM Forum 2024,
> >>>> gaining some support for removal, some disagreement, but garnering no
> >>>> new hardware support, leaving the driver in a state where it cannot
> >>>> be tested.
> >>>>
> >>>> Mark as obsolete and subject to removal.  
> >>> Recently (this year) in Android, we enabled VFIO-platform for protected KVM,
> >>> and it’s supported in our VMM (CrosVM) [1].
> >>> CrosVM support is different from QEMU, as it doesn't require any device
> >>> specific logic in the VMM. However, it relies on loading a device tree
> >>> template at runtime (with “compatible” string...) and it will just
> >>> override regs, irqs.. So it doesn’t need device knowledge (at least for now).
> >>> Similarly, the kernel doesn’t need reset drivers as the hypervisor handles that.
> >> I think what we attempt to achieve in vfio is repeatability and data
> >> integrity independent of the hypervisor.  IOW, if we 'kill -9' the
> >> hypervisor process, the kernel can bring the device back to a default
> >> state where the device isn't wedged or leaking information through the
> >> device to the next use case.  If the hypervisor wants to support
> >> enhanced resets on top of that, that's great, but I think it becomes
> >> difficult to argue that vfio-platform itself holds up its end of the
> >> bargain if we're really trusting the hypervisor to handle these aspects.
> > Sorry I was not clear, we only use that in Android for ARM64 and pKVM,
> > where the hypervisor in this context means the code running in EL2 which
> > is more privileged than the kernel, so it should be trusted.
> > However, as I mentioned that code is not upstream yet, so it's a valid
> > concern that the kernel still needs a reset driver.
> >
> >>> Unfortunately, there is no upstream support at the moment, we are making
> >>> some -slow- progress on that [2][3]
> >>>
> >>> If it helps, I have access to HW that can run that and I can review/test
> >>> changes, until upstream support lands; if you are open to keeping VFIO-platform.
> >>> Or I can look into adding support for existing upstream HW (on platforms I am
> >>> familiar with, such as Pixel-6)
> >> Ultimately I'll lean on Eric to make the call.  I know he's concerned
> >> about testing, but he raised that and various other concerns about whether
> >> platform devices really have a future with vfio nearly a year ago and
> >> nothing has changed.  Currently it requires a module option opt-in to
> >> enable devices that the kernel doesn't know how to reset.  Is that
> >> sufficient or should use of such a device taint the kernel?  If any
> >> device beyond the few e-waste devices that we know how to reset taint
> >> the kernel, should this support really even be in the kernel?  Thanks,
> > I think with the way it’s supported at the moment we need the kernel
> > to ensure that reset happens.
> 
> Effectively my main concern is I cannot test vfio-platform anymore. We
> had some CVEs also impacting the vfio platform code base and it is a
> major issue not being able to test. That's why I was obliged, last year,
> to resume the integration of a new device (the tegra234 mgbe), nobody
> seemed to be really interested in and this work could not be upstreamed
> due to lack of traction and its hacky nature.
> 
> You did not really comment on which kind of devices were currently
> integrated. Are they within the original scope of vfio (with DMA
> capabilities and protected by an IOMMU)? Last discussion we had in
> https://lore.kernel.org/all/ZvvLpLUZnj-Z_tEs@google.com/ led to the
> conclusion that maybe VFIO was not the best suited framework.

At the moment, Android device assignment only supports DMA-capable
devices that are behind an IOMMU, and we use VFIO-platform for that.
Most of our use cases are accelerators.

In that thread, I was looking into adding support for simpler devices
(such as sensors) but as discussed that won’t be done through
VFIO-platform.

Ignoring Android, as I mentioned, I can work on adding support for
existing upstream platforms (preferably ARM64, that I can get access to)
such as Pixel-6, which should make it easier to test.

Also, we have some interest in adding new features such as runtime
power management.

>
> In case we keep the driver in, I think we need to get a guarantee that
> you or someone else at Google commits to review and test potential
> changes with a perspective to take over its maintenance.

I can’t make guarantees on behalf of Google, but I can contribute to
reviewing, testing and maintaining the driver as far as I am able to.
If you want, you can add me as a reviewer for the driver.

Thanks,
Mostafa


> 
> Thanks
> 
> Eric
> 
> >
> > But maybe instead of having that specific reset handler for VFIO, we
> > can rely on the “shutdown” method already existing in "platform_driver"?
> > I believe that should put the device in a state where it can be re-probed
> > safely. Not all devices implement that, but it seems more generic
> > and scalable.
> >
> > Thanks,
> > Mostafa
> >
> >> Alex
> >>
> 
