[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210601174229.GP1002214@nvidia.com>
Date: Tue, 1 Jun 2021 14:42:29 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: "Tian, Kevin" <kevin.tian@...el.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
Joerg Roedel <joro@...tes.org>,
Lu Baolu <baolu.lu@...ux.intel.com>,
David Woodhouse <dwmw2@...radead.org>,
"iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"Alex Williamson (alex.williamson@...hat.com)"
<alex.williamson@...hat.com>, Jason Wang <jasowang@...hat.com>,
Eric Auger <eric.auger@...hat.com>,
Jonathan Corbet <corbet@....net>,
"Raj, Ashok" <ashok.raj@...el.com>,
"Liu, Yi L" <yi.l.liu@...el.com>, "Wu, Hao" <hao.wu@...el.com>,
"Jiang, Dave" <dave.jiang@...el.com>,
Jacob Pan <jacob.jun.pan@...ux.intel.com>,
Jean-Philippe Brucker <jean-philippe@...aro.org>,
David Gibson <david@...son.dropbear.id.au>,
Kirti Wankhede <kwankhede@...dia.com>,
Robin Murphy <robin.murphy@....com>
Subject: Re: [RFC] /dev/ioasid uAPI proposal
On Tue, Jun 01, 2021 at 08:10:14AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@...dia.com>
> > Sent: Saturday, May 29, 2021 1:36 AM
> >
> > On Thu, May 27, 2021 at 07:58:12AM +0000, Tian, Kevin wrote:
> >
> > > IOASID nesting can be implemented in two ways: hardware nesting and
> > > software nesting. With hardware support the child and parent I/O page
> > > tables are walked consecutively by the IOMMU to form a nested translation.
> > > When it's implemented in software, the ioasid driver is responsible for
> > > merging the two-level mappings into a single-level shadow I/O page table.
> > > Software nesting requires both child/parent page tables operated through
> > > the dma mapping protocol, so any change in either level can be captured
> > > by the kernel to update the corresponding shadow mapping.
> >
> > Why? A SW emulation could do this synchronization during invalidation
> > processing if invalidation contained an IOVA range.
>
> In this proposal we differentiate between host-managed and user-
> managed I/O page tables. If host-managed, the user is expected to use
> map/unmap cmd explicitly upon any change required on the page table.
> If user-managed, the user first binds its page table to the IOMMU and
> then use invalidation cmd to flush iotlb when necessary (e.g. typically
> not required when changing a PTE from non-present to present).
>
> We expect user to use map+unmap and bind+invalidate respectively
> instead of mixing them together. Following this policy, map+unmap
> must be used in both levels for software nesting, so changes in either
> level are captured timely to synchronize the shadow mapping.
map+unmap or bind+invalidate is a policy of the IOASID itself set when
it is created. If you put two different types in a tree then each IOASID
must continue to use its own operation mode.
I don't see a reason to force all IOASIDs in a tree to be consistent??
A software emulated two level page table where the leaf level is a
bound page table in guest memory should continue to use
bind/invalidate to maintain the guest page table IOASID even though it
is a SW construct.
The GPA level should use map/unmap because it is a kernel owned page
table
Though how to efficiently mix map/unmap on the GPA when there are SW
nested levels below it looks to be quite challenging.
Jason
Powered by blists - more mailing lists