Message-ID: <YLhupAfUWWiVMDVU@yekko>
Date: Thu, 3 Jun 2021 15:54:44 +1000
From: David Gibson <david@...son.dropbear.id.au>
To: Lu Baolu <baolu.lu@...ux.intel.com>
Cc: Jason Gunthorpe <jgg@...dia.com>,
"Tian, Kevin" <kevin.tian@...el.com>,
LKML <linux-kernel@...r.kernel.org>,
Joerg Roedel <joro@...tes.org>,
David Woodhouse <dwmw2@...radead.org>,
"iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"Alex Williamson (alex.williamson@...hat.com)"
<alex.williamson@...hat.com>, Jason Wang <jasowang@...hat.com>,
Eric Auger <eric.auger@...hat.com>,
Jonathan Corbet <corbet@....net>,
"Raj, Ashok" <ashok.raj@...el.com>,
"Liu, Yi L" <yi.l.liu@...el.com>, "Wu, Hao" <hao.wu@...el.com>,
"Jiang, Dave" <dave.jiang@...el.com>,
Jacob Pan <jacob.jun.pan@...ux.intel.com>,
Jean-Philippe Brucker <jean-philippe@...aro.org>,
Kirti Wankhede <kwankhede@...dia.com>,
Robin Murphy <robin.murphy@....com>
Subject: Re: [RFC] /dev/ioasid uAPI proposal
On Tue, Jun 01, 2021 at 07:09:21PM +0800, Lu Baolu wrote:
> Hi Jason,
>
> On 2021/5/29 7:36, Jason Gunthorpe wrote:
> > > /*
> > > * Bind a user-managed I/O page table with the IOMMU
> > > *
> > > * Because user page table is untrusted, IOASID nesting must be enabled
> > > * for this ioasid so the kernel can enforce its DMA isolation policy
> > > * through the parent ioasid.
> > > *
> > > * Pgtable binding protocol is different from DMA mapping. The latter
> > > * has the I/O page table constructed by the kernel and updated
> > > * according to user MAP/UNMAP commands. With pgtable binding the
> > > * whole page table is created and updated by userspace, thus a different
> > > * set of commands is required (bind, iotlb invalidation, page fault, etc.).
> > > *
> > > * Because the page table is directly walked by the IOMMU, the user
> > > * must use a format compatible with the underlying hardware. It can
> > > * check the format information through IOASID_GET_INFO.
> > > *
> > > * The page table is bound to the IOMMU according to the routing
> > > * information of each attached device under the specified IOASID. The
> > > * routing information (RID and optional PASID) is registered when a
> > > * device is attached to this IOASID through VFIO uAPI.
> > > *
> > > * Input parameters:
> > > * - child_ioasid;
> > > * - address of the user page table;
> > > * - formats (vendor, address_width, etc.);
> > > *
> > > * Return: 0 on success, -errno on failure.
> > > */
> > > #define IOASID_BIND_PGTABLE _IO(IOASID_TYPE, IOASID_BASE + 9)
> > > #define IOASID_UNBIND_PGTABLE _IO(IOASID_TYPE, IOASID_BASE + 10)
> > Also feels backwards, why wouldn't we specify this, and the required
> > page table format, during alloc time?
> >
>
> Thinking of the required page table format, perhaps we should shed more
> light on the page table of an IOASID. So far, an IOASID might represent
> one of the following page tables (might be more):
>
> 1) an IOMMU format page table (a.k.a. iommu_domain)
> 2) a user application CPU page table (SVA for example)
> 3) a KVM EPT (future option)
> 4) a VM guest managed page table (nesting mode)
>
> This version only covers 1) and 4). Do you think we need to support 2),
Isn't (2) the equivalent of using the host-managed pagetable and then
doing a giant MAP of your entire user address space into it? But
maybe we should identify that case explicitly, in case the host can
optimize it.
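To make the comparison concrete, here is a toy sketch of case (2) expressed as one big MAP into a kernel-managed table. None of this is the proposed uAPI: `toy_ioasid`, `toy_init`, `toy_map` and `toy_translate` are made-up names, the "page tables" are flat arrays, and the 16-page address space is an arbitrary assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGES    16          /* pretend user address space, in pages */
#define TOY_UNMAPPED UINT64_MAX  /* marker for "no translation" */

/* Hypothetical stand-in for a kernel-managed I/O page table. */
struct toy_ioasid {
    uint64_t pfn[TOY_PAGES];     /* IOVA page -> physical frame */
};

/* Start with nothing mapped. */
static void toy_init(struct toy_ioasid *io)
{
    for (size_t i = 0; i < TOY_PAGES; i++)
        io->pfn[i] = TOY_UNMAPPED;
}

/*
 * Pretend MAP command: for each page in the range, copy the frame the
 * CPU page table currently holds, so that IOVA == user VA.
 * Returns 0 on success, -1 if the range falls outside the toy space.
 */
static int toy_map(struct toy_ioasid *io, const uint64_t *cpu_pt,
                   size_t first_page, size_t npages)
{
    if (first_page > TOY_PAGES || npages > TOY_PAGES - first_page)
        return -1;
    for (size_t i = first_page; i < first_page + npages; i++)
        io->pfn[i] = cpu_pt[i];
    return 0;
}

/* Look up what the device would see for a given IOVA page. */
static uint64_t toy_translate(const struct toy_ioasid *io, size_t page)
{
    return page < TOY_PAGES ? io->pfn[page] : TOY_UNMAPPED;
}
```

Calling `toy_map(&io, cpu_pt, 0, TOY_PAGES)` is the "giant MAP": afterwards the device sees the same translations as the CPU. In this toy the I/O table is only a snapshot, though, so it goes stale if the CPU mappings change later. That may be one reason to identify case (2) explicitly rather than leave it to userspace.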
> 3) and beyond? If so, it seems that we need some in-kernel helpers and
> uAPIs to support pre-installing a page table to IOASID. From this point
> of view an IOASID is actually not just a variant of iommu_domain, but an
> I/O page table representation in a broader sense.
>
> Best regards,
> baolu
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson