Message-ID: <00001486-b43d-4c2b-a41c-35ab5e823f21@redhat.com>
Date: Tue, 19 Aug 2025 11:58:32 +0200
From: Eric Auger <eric.auger@...hat.com>
To: Mostafa Saleh <smostafa@...gle.com>,
 Alex Williamson <alex.williamson@...hat.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org, clg@...hat.com
Subject: Re: [PATCH 2/2] vfio/platform: Mark for removal

Hi Mostafa,

On 8/18/25 7:33 PM, Mostafa Saleh wrote:
> On Mon, Aug 18, 2025 at 10:52:42AM -0600, Alex Williamson wrote:
>> On Fri, 15 Aug 2025 16:59:37 +0000
>> Mostafa Saleh <smostafa@...gle.com> wrote:
>>
>>> Hi Alex,
>>>
>>> On Wed, Aug 06, 2025 at 11:03:12AM -0600, Alex Williamson wrote:
>>>> vfio-platform hasn't had a meaningful contribution in years.  In-tree
>>>> hardware support is predominantly only for devices which are long since
>>>> e-waste.  QEMU support for platform devices is slated for removal in
>>>> QEMU-10.2.  Eric Auger presented on the future of the vfio-platform
>>>> driver and difficulties supporting new devices at KVM Forum 2024,
>>>> gaining some support for removal, some disagreement, but garnering no
>>>> new hardware support, leaving the driver in a state where it cannot
>>>> be tested.
>>>>
>>>> Mark as obsolete and subject to removal.  
>>> Recently (this year) in Android, we enabled VFIO-platform for protected KVM,
>>> and it’s supported in our VMM (CrosVM) [1].
>>> CrosVM support is different from QEMU, as it doesn't require any device
>>> specific logic in the VMM; however, it relies on loading a device tree
>>> template at runtime (with “compatible” string...) and it will just
>>> override regs, irqs.. So it doesn’t need device knowledge (at least for now).
>>> Similarly, the kernel doesn’t need reset drivers as the hypervisor handles that.
>> I think what we attempt to achieve in vfio is repeatability and data
>> integrity independent of the hypervisor.  IOW, if we 'kill -9' the
>> hypervisor process, the kernel can bring the device back to a default
>> state where the device isn't wedged or leaking information through the
>> device to the next use case.  If the hypervisor wants to support
>> enhanced resets on top of that, that's great, but I think it becomes
>> difficult to argue that vfio-platform itself holds up its end of the
>> bargain if we're really trusting the hypervisor to handle these aspects.
> Sorry I was not clear, we only use that in Android for ARM64 and pKVM,
> where the hypervisor in this context means the code running in EL2 which
> is more privileged than the kernel, so it should be trusted.
> However, as I mentioned, that code is not upstream yet, so it's a valid
> concern that the kernel still needs a reset driver.
>
>>> Unfortunately, there is no upstream support at the moment, we are making
>>> some -slow- progress on that [2][3]
>>>
>>> If it helps, I have access to HW that can run that and I can review/test
>>> changes, until upstream support lands; if you are open to keeping VFIO-platform.
>>> Or I can look into adding support for existing upstream HW (with platforms I am
>>> familiar with, such as Pixel-6)
>> Ultimately I'll lean on Eric to make the call.  I know he's concerned
>> about testing, but he raised that and various other concerns about whether
>> platform devices really have a future with vfio nearly a year ago and
>> nothing has changed.  Currently it requires a module option opt-in to
>> enable devices that the kernel doesn't know how to reset.  Is that
>> sufficient or should use of such a device taint the kernel?  If any
>> device beyond the few e-waste devices that we know how to reset taints
>> the kernel, should this support really even be in the kernel?  Thanks,
> I think with the way it’s supported at the moment we need the kernel
> to ensure that reset happens.

Indeed, my main concern is that I can no longer test vfio-platform. We
also had some CVEs impacting the vfio-platform code base, and not being
able to test is a major issue. That's why I was obliged, last year, to
resume the integration of a new device (the tegra234 mgbe). Nobody
seemed to be really interested in it, and this work could not be
upstreamed due to lack of traction and its hacky nature.

You did not really comment on which kinds of devices are currently
integrated. Are they within the original scope of vfio (with DMA
capabilities and protected by an IOMMU)? The last discussion we had, in
https://lore.kernel.org/all/ZvvLpLUZnj-Z_tEs@google.com/, led to the
conclusion that maybe VFIO was not the best suited framework.

If we keep the driver in, I think we need a guarantee that you or
someone else at Google commits to reviewing and testing potential
changes, with a view to taking over its maintenance.

Thanks

Eric

>
> But maybe instead of having that specific reset handler for VFIO, we
> can rely on the “shutdown” method already existing in "platform_driver"?
> I believe that should put the device in a state where it can be re-probed
> safely. Not all devices implement that, but it seems more generic
> and scalable.
>
> Thanks,
> Mostafa
>
>> Alex
>>

