Message-ID: <MWHPR11MB1886BD0790C148E88C5C795E8C3B9@MWHPR11MB1886.namprd11.prod.outlook.com>
Date: Fri, 4 Jun 2021 02:15:54 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: David Gibson <david@...son.dropbear.id.au>,
Jean-Philippe Brucker <jean-philippe@...aro.org>,
"Jiang, Dave" <dave.jiang@...el.com>,
"Raj, Ashok" <ashok.raj@...el.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
Jonathan Corbet <corbet@....net>,
David Woodhouse <dwmw2@...radead.org>,
Jason Wang <jasowang@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
"Kirti Wankhede" <kwankhede@...dia.com>,
"Alex Williamson (alex.williamson@...hat.com)"
<alex.williamson@...hat.com>,
"iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
"Robin Murphy" <robin.murphy@....com>
Subject: RE: [RFC] /dev/ioasid uAPI proposal
> From: Jason Gunthorpe <jgg@...dia.com>
> Sent: Thursday, June 3, 2021 7:47 PM
>
> On Thu, Jun 03, 2021 at 06:49:20AM +0000, Tian, Kevin wrote:
> > > From: David Gibson
> > > Sent: Thursday, June 3, 2021 1:09 PM
> > [...]
> > > > > In this way the SW mode is the same as a HW mode with an infinite
> > > > > cache.
> > > > >
> > > > > > The collapsed shadow page table is really just a cache.
> > > > >
> > > >
> > > > OK. One additional thing is that we may need a 'caching_mode'
> > > > thing reported by /dev/ioasid, indicating whether invalidation is
> > > > required when changing non-present to present. For hardware
> > > > nesting it's not reported as the hardware IOMMU will walk the
> > > > guest page table in cases of iotlb miss. For software nesting
> > > > caching_mode is reported so the user must issue invalidation
> > > > upon any change in guest page table so the kernel can update
> > > > the shadow page table timely.
> > >
> > > For the first cut, I'd have the API assume that invalidates are
> > > *always* required. Some bypass to avoid them in cases where they're
> > > not needed can be an additional extension.
> > >
> >
> > Isn't the typical TLB semantics that non-present entries are not
> > cached, so invalidation is not required when changing non-present
> > to present? That is true for both the CPU TLB and the IOMMU TLB. In
> > practice I feel there are more usages built on hardware nesting than
> > software nesting, so making the default follow hardware TLB behavior
> > makes more sense...
>
> From a modelling perspective it makes sense to have the most general
> be the default and if an implementation can elide certain steps then
> describing those as additional behaviors on the universal baseline is
> cleaner
>
> I'm surprised to hear your remarks about the not-present though,
> how does the vIOMMU emulation work if there are no hypervisor
> invalidation traps for not-present/present transitions?
>
Such invalidation traps matter only for a shadow I/O page table (software
nesting). For hardware nesting, no trap is required for a non-present to
present transition, since the physical IOTLB doesn't cache non-present
entries. The IOMMU will walk the guest I/O page table on an IOTLB
miss.

The vIOMMU should be constructed according to whether software
or hardware nesting is used. For Intel (and AMD iirc), a caching_mode
capability decides whether the guest needs to do invalidation for a
non-present to present transition. Such a vIOMMU should clear this bit
for hardware nesting or set it for software nesting. ARM SMMU doesn't
have this capability. Therefore a vSMMU can only work with a
hardware nested IOASID.
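
As a rough sketch of the rule above: every name below is hypothetical
and made up for illustration, not an existing kernel, QEMU or
/dev/ioasid interface. It only encodes the decision described in this
mail, i.e. clear caching_mode for hardware nesting, set it for software
nesting, and refuse software nesting when the vIOMMU model has no such
capability (the vSMMU case).

```c
#include <stdbool.h>

/* Hypothetical: how the IOASID backing the vIOMMU is nested. */
enum nesting_mode {
	NESTING_SOFTWARE,	/* kernel maintains a shadow I/O page table */
	NESTING_HARDWARE,	/* physical IOMMU walks the guest I/O page table */
};

/* Hypothetical description of a vIOMMU model. */
struct viommu_model {
	/* true for models with a caching_mode capability, false for a vSMMU */
	bool has_caching_mode_cap;
};

/*
 * Decide the caching_mode bit exposed to the guest.
 * Returns 0 and stores the bit in *caching_mode, or -1 if the vIOMMU
 * model cannot be used with the requested nesting mode.
 */
static int viommu_pick_caching_mode(const struct viommu_model *m,
				    enum nesting_mode mode,
				    bool *caching_mode)
{
	if (mode == NESTING_HARDWARE) {
		/* Non-present entries are not cached by the physical IOTLB. */
		*caching_mode = false;
		return 0;
	}

	/* Software nesting needs a trap on every guest page table change. */
	if (!m->has_caching_mode_cap)
		return -1;

	*caching_mode = true;
	return 0;
}

/* Must the guest invalidate when changing non-present to present? */
static bool guest_must_invalidate_on_map(bool caching_mode)
{
	return caching_mode;
}
```

With this sketch, a model that has the capability works in both modes,
while a model without it succeeds only for NESTING_HARDWARE.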

Thanks
Kevin