Message-ID: <44a8b643-6920-b2b5-a593-2942b5ea4ee7@huawei.com>
Date:   Sat, 30 Jan 2021 17:30:58 +0800
From:   Shenming Lu <lushenming@...wei.com>
To:     Alex Williamson <alex.williamson@...hat.com>
CC:     Cornelia Huck <cohuck@...hat.com>, <kvm@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>,
        Jean-Philippe Brucker <jean-philippe@...aro.org>,
        Eric Auger <eric.auger@...hat.com>,
        Lu Baolu <baolu.lu@...ux.intel.com>,
        Kevin Tian <kevin.tian@...el.com>,
        <wanghaibin.wang@...wei.com>, <yuzenghui@...wei.com>
Subject: Re: [RFC PATCH v1 0/4] vfio: Add IOPF support for VFIO passthrough

On 2021/1/30 6:57, Alex Williamson wrote:
> On Mon, 25 Jan 2021 17:03:58 +0800
> Shenming Lu <lushenming@...wei.com> wrote:
> 
>> Hi,
>>
>> The static pinning and mapping problem in VFIO and possible solutions
>> have been discussed a lot [1, 2]. One of the solutions is to add I/O
>> page fault (IOPF) support for VFIO devices. Unlike the relatively
>> complicated software approaches, such as presenting a vIOMMU that provides
>> the DMA buffer information (possibly with para-virtualized optimizations),
>> IOPF mainly relies on hardware faulting capabilities, such as the PCIe
>> PRI extension or the Arm SMMU stall model. What's more, IOPF support in
>> the IOMMU driver is already being implemented for SVA [3]. So should we
>> consider adding IOPF support for VFIO passthrough based on the IOPF part
>> of SVA at present?
>>
>> We have implemented a basic demo only for one stage of translation (GPA
>> -> HPA in virtualization; note that it can be configured at either stage),
>> and tested it on a Hisilicon Kunpeng 920 board. The nested mode is more
>> complicated since VFIO only handles the second-stage page faults (the same
>> as in the non-nested case), while the first-stage page faults need to be
>> further delivered to the guest, which is being implemented in [4] on ARM.
>> My thought on this is to report the page faults to VFIO regardless of the
>> stage at which they occur (carrying the stage information if possible), and
>> handle them accordingly in VFIO based on the configured mode. Or the IOMMU
>> driver might evolve to support more...
>>
>> Might TODO:
>>  - Optimize the faulting path, and measure the performance (it might still
>>    be a big issue).
>>  - Add support for PRI.
>>  - Add an MMU notifier to avoid pinning.
>>  - Add support for the nested mode.
>> ...
>>
>> Any comments and suggestions are very welcome. :-)
> 
> I expect performance to be pretty bad here; the lookup involved per
> fault is excessive.

We might consider pre-pinning more pages as a further optimization.

> There are cases where a user is not going to be
> willing to have a slow ramp up of performance for their devices as they
> fault in pages, so we might need to consider making this
> configurable through the vfio interface.

Yeah, that makes sense. I will try to implement this: maybe add an ioctl
called VFIO_IOMMU_ENABLE_IOPF for the Type1 VFIO IOMMU...
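
Something like the following uapi sketch, perhaps (the struct layout, flag
and ioctl number are placeholders I made up for illustration; only the
VFIO_IOMMU_ENABLE_IOPF name comes from the idea above):

    /* include/uapi/linux/vfio.h -- hypothetical addition */
    struct vfio_iommu_type1_iopf_enable {
            __u32   argsz;
            __u32   flags;
    #define VFIO_IOPF_PREPIN_ALL    (1 << 0)        /* keep the current full-pinning behaviour */
            __u32   limit_level;    /* see the memory-limit discussion below */
    };

    #define VFIO_IOMMU_ENABLE_IOPF  _IO(VFIO_TYPE, VFIO_BASE + 18)  /* number is a placeholder */

Userspace would then call something like
ioctl(container_fd, VFIO_IOMMU_ENABLE_IOPF, &enable) on the container
instead of (or before) pinning everything up front.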

> Our page mapping also only
> grows here, should mappings expire or do we need a least recently
> mapped tracker to avoid exceeding the user's locked memory limit?  How
> does a user know what to set for a locked memory limit?

Yeah, we could add a least-recently-mapped (LRU) tracker to release pages
when a memory limit is exceeded, maybe with a worker that periodically
checks this. As for the memory limit, maybe we could let the user choose
from a few levels (10% (default) / 30% / 50% / 70% / unlimited of the total
user memory (mapping size)) via the VFIO_IOMMU_ENABLE_IOPF ioctl...
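
To make the LRU idea a bit more concrete, a very rough sketch (all names
below are made up; none of this is existing VFIO code, and locking is
omitted for brevity):

    #include <linux/list.h>

    struct vfio_iopf_mapped_page {
            struct list_head        lru;    /* on vfio_iopf_lru, most recently mapped at the tail */
            unsigned long           iova;
    };

    static LIST_HEAD(vfio_iopf_lru);
    static unsigned long vfio_iopf_mapped;  /* currently mapped pages */
    static unsigned long vfio_iopf_limit;   /* derived from the user-chosen level */

    /* Called from the fault path after a page has been pinned and mapped. */
    static void vfio_iopf_track(struct vfio_iopf_mapped_page *p)
    {
            list_add_tail(&p->lru, &vfio_iopf_lru);
            vfio_iopf_mapped++;
    }

    /* Run periodically (e.g. from a delayed work item) to enforce the limit. */
    static void vfio_iopf_reclaim(void)
    {
            while (vfio_iopf_mapped > vfio_iopf_limit &&
                   !list_empty(&vfio_iopf_lru)) {
                    struct vfio_iopf_mapped_page *p =
                            list_first_entry(&vfio_iopf_lru,
                                             struct vfio_iopf_mapped_page, lru);

                    list_del(&p->lru);
                    vfio_iopf_mapped--;
                    vfio_iopf_unmap_one(p); /* hypothetical: unmap, unpin and free */
            }
    }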

> The behavior
> here would lead to cases where an idle system might be ok, but as soon
> as load increases with more inflight DMA, we start seeing
> "unpredictable" I/O faults from the user perspective.

"unpredictable" I/O faults? We might see more problems after more testing...

Thanks,
Shenming

> Seems like there
> are lots of outstanding considerations and I'd also like to hear from
> the SVA folks about how this meshes with their work.  Thanks,
> 
> Alex
> 
