Message-ID: <MWHPR11MB1886498515951BCE98F9336A8C699@MWHPR11MB1886.namprd11.prod.outlook.com>
Date: Thu, 18 Mar 2021 12:32:49 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Shenming Lu <lushenming@...wei.com>,
Alex Williamson <alex.williamson@...hat.com>
CC: Cornelia Huck <cohuck@...hat.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Jean-Philippe Brucker <jean-philippe@...aro.org>,
Eric Auger <eric.auger@...hat.com>,
Lu Baolu <baolu.lu@...ux.intel.com>,
"wanghaibin.wang@...wei.com" <wanghaibin.wang@...wei.com>,
"yuzenghui@...wei.com" <yuzenghui@...wei.com>,
"Liu, Yi L" <yi.l.liu@...el.com>,
"Pan, Jacob jun" <jacob.jun.pan@...el.com>
Subject: RE: [RFC PATCH v1 0/4] vfio: Add IOPF support for VFIO passthrough
> From: Shenming Lu <lushenming@...wei.com>
> Sent: Thursday, March 18, 2021 7:54 PM
>
> On 2021/3/18 17:07, Tian, Kevin wrote:
> >> From: Shenming Lu <lushenming@...wei.com>
> >> Sent: Thursday, March 18, 2021 3:53 PM
> >>
> >> On 2021/2/4 14:52, Tian, Kevin wrote:
> >>>>> In reality, many
> >>>>> devices allow I/O faulting only in selective contexts. However, there
> >>>>> is no standard way (e.g. PCISIG) for the device to report whether
> >>>>> arbitrary I/O fault is allowed. Then we may have to maintain device
> >>>>> specific knowledge in software, e.g. in an opt-in table to list devices
> >>>>> which allows arbitrary faults. For devices which only support selective
> >>>>> faulting, a mediator (either through vendor extensions on vfio-pci-core
> >>>>> or a mdev wrapper) might be necessary to help lock down non-faultable
> >>>>> mappings and then enable faulting on the rest mappings.
> >>>>
> >>>> For devices which only support selective faulting, they could tell it to the
> >>>> IOMMU driver and let it filter out non-faultable faults? Do I get it wrong?
> >>>
> >>> Not exactly to IOMMU driver. There is already a vfio_pin_pages() for
> >>> selectively page-pinning. The matter is that 'they' imply some device
> >>> specific logic to decide which pages must be pinned and such knowledge
> >>> is outside of VFIO.
> >>>
> >>> From enabling p.o.v we could possibly do it in phased approach. First
> >>> handles devices which tolerate arbitrary DMA faults, and then extends
> >>> to devices with selective-faulting. The former is simpler, but with one
> >>> main open whether we want to maintain such device IDs in a static
> >>> table in VFIO or rely on some hints from other components (e.g. PF
> >>> driver in VF assignment case). Let's see how Alex thinks about it.
> >>
> >> Hi Kevin,
> >>
> >> You mentioned selective-faulting some time ago. I still have some doubt
> >> about it:
> >> There is already a vfio_pin_pages() which is used for limiting the IOMMU
> >> group dirty scope to pinned pages, could it also be used for indicating
> >> the faultable scope is limited to the pinned pages and the rest mappings
> >> is non-faultable that should be pinned and mapped immediately? But it
> >> seems to be a little weird and not exactly to what you meant... I will
> >> be grateful if you can help to explain further. :-)
> >>
> >
> > The opposite, i.e. the vendor driver uses vfio_pin_pages to lock down
> > pages that are not faultable (based on its specific knowledge) and then
> > the rest memory becomes faultable.
>
> Ahh...
> Thus, from the perspective of VFIO IOMMU, if IOPF enabled for such device,
> only the page faults within the pinned range are valid in the registered
> iommu fault handler...
> I have another question here, for the IOMMU backed devices, they are already
> all pinned and mapped when attaching, is there a need to call vfio_pin_pages()
> to lock down pages for them? Did I miss something?...
>
If a device is marked as supporting I/O page faults (fully or selectively),
there should be no pinning at attach or DMA_MAP time (as I suppose this
series does). Then, for a device with selective faulting, its vendor
driver will lock down the non-faultable pages at run-time, e.g. when
intercepting guest registration of a ring buffer...
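
To make the intended flow concrete, here is a rough userspace model of the
policy described above. It is only a sketch and not kernel code: every name
in it (toy_dev, toy_dma_map, toy_vendor_pin, toy_may_fault, NPAGES) is
hypothetical, and toy_vendor_pin merely stands in for a vendor driver
calling vfio_pin_pages() on the ranges it knows to be non-faultable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 16 /* size of the toy IOVA space in pages (hypothetical) */

/* How much I/O page faulting the (hypothetical) device tolerates. */
enum fault_mode { FAULT_NONE, FAULT_SELECTIVE, FAULT_ARBITRARY };

struct toy_dev {
	enum fault_mode mode;
	bool pinned[NPAGES]; /* true once a page is pinned and mapped */
};

/*
 * DMA_MAP-time policy: pin up front only for devices that cannot fault.
 * For fault-capable devices nothing is pinned here.
 */
static void toy_dma_map(struct toy_dev *d, size_t first, size_t count)
{
	assert(first <= NPAGES && count <= NPAGES - first);
	if (d->mode != FAULT_NONE)
		return; /* deferred: pages get populated on fault or pinned later */
	for (size_t i = first; i < first + count; i++)
		d->pinned[i] = true;
}

/*
 * Stand-in for the vendor driver locking down a non-faultable range at
 * run-time, e.g. after intercepting guest registration of a ring buffer.
 */
static void toy_vendor_pin(struct toy_dev *d, size_t first, size_t count)
{
	assert(first <= NPAGES && count <= NPAGES - first);
	for (size_t i = first; i < first + count; i++)
		d->pinned[i] = true;
}

/* A page can still take an I/O page fault only if it was never pinned. */
static bool toy_may_fault(const struct toy_dev *d, size_t page)
{
	assert(page < NPAGES);
	return d->mode != FAULT_NONE && !d->pinned[page];
}
```

In this model a FAULT_SELECTIVE device leaves toy_dma_map() with nothing
pinned, and only the ranges later passed to toy_vendor_pin() stop being
faultable; a FAULT_NONE device is fully pinned at map time as today.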
Thanks
Kevin