Message-ID: <Z5DQuNn3GoU1vYPV@google.com>
Date: Wed, 22 Jan 2025 11:04:24 +0000
From: Mostafa Saleh <smostafa@...gle.com>
To: "Tian, Kevin" <kevin.tian@...el.com>
Cc: Jason Gunthorpe <jgg@...pe.ca>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"kvmarm@...ts.linux.dev" <kvmarm@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>,
"catalin.marinas@....com" <catalin.marinas@....com>,
"will@...nel.org" <will@...nel.org>,
"maz@...nel.org" <maz@...nel.org>,
"oliver.upton@...ux.dev" <oliver.upton@...ux.dev>,
"joey.gouly@....com" <joey.gouly@....com>,
"suzuki.poulose@....com" <suzuki.poulose@....com>,
"yuzenghui@...wei.com" <yuzenghui@...wei.com>,
"robdclark@...il.com" <robdclark@...il.com>,
"joro@...tes.org" <joro@...tes.org>,
"robin.murphy@....com" <robin.murphy@....com>,
"jean-philippe@...aro.org" <jean-philippe@...aro.org>,
"nicolinc@...dia.com" <nicolinc@...dia.com>,
"vdonnefort@...gle.com" <vdonnefort@...gle.com>,
"qperret@...gle.com" <qperret@...gle.com>,
"tabba@...gle.com" <tabba@...gle.com>,
"danielmentz@...gle.com" <danielmentz@...gle.com>,
"tzukui@...gle.com" <tzukui@...gle.com>
Subject: Re: [RFC PATCH v2 00/58] KVM: Arm SMMUv3 driver for pKVM
On Fri, Jan 17, 2025 at 06:57:12AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@...pe.ca>
> > Sent: Friday, January 17, 2025 3:15 AM
> >
> > On Thu, Jan 16, 2025 at 06:39:31AM +0000, Tian, Kevin wrote:
> > > > From: Mostafa Saleh <smostafa@...gle.com>
> > > > Sent: Wednesday, January 8, 2025 8:10 PM
> > > >
> > > > On Thu, Jan 02, 2025 at 04:16:14PM -0400, Jason Gunthorpe wrote:
> > > > > On Fri, Dec 13, 2024 at 07:39:04PM +0000, Mostafa Saleh wrote:
> > > > > > Yeah, SVA is tricky, I guess for that we would have to use nesting,
> > > > > > but tbh, I don’t think it’s a deal breaker for now.
> > > > >
> > > > > Again, it depends what your actual use case for translation is inside
> > > > > the host/guest environments. It would be good to clearly spell this out..
> > > > > There are a few drivers that directly manipulate the iommu_domains of a
> > > > > device: a few GPUs, ath1x wireless, some Tegra stuff, "venus". Which
> > > > > of those are you targetting?
> > > > >
> > > >
> > > > Not sure I understand this point about manipulating domains.
> > > > AFAIK, SVA is not that common, including in mobile spaces, but I can be
> > > > wrong; that’s why it’s not a priority here.
> > >
> > > Nested translation is required beyond SVA. A scenario which requires
> > > a vIOMMU and multiple device domains within the guest would want to
> > > use nesting. Especially for the Arm vSMMU, nesting is a must.
We can still do para-virtualization for guests the same way we do for the
host and use a single-stage IOMMU.
> >
> > Right, if you need an iommu domain in the guest there are only three
> > mainstream ways to get this in Linux:
> > 1) Use the DMA API and have the iommu group be translating. This is
> > optional, as the DMA API usually supports identity mapping as well.
> > 2) A driver directly calls iommu_paging_domain_alloc() and manually
> > attaches it to some device, and does not use the DMA API. My list
> > above of ath1x/etc are examples doing this
> > 3) Use VFIO
> >
> > My remark to Mostafa is to be specific, which of the above do you want
> > to do in your mobile guest (and what driver exactly if #2) and why.
> >
> > This will help inform what the performance profile looks like and
> > guide if nesting/para virt is appropriate.
>
AFAIK, the most common use cases would be:
- Devices using the DMA API because they need a lot of memory to be
contiguous in IOVA, which is hard to achieve with identity mapping
- Devices with security requirements/constraints to be isolated from the
rest of the system, also using the DMA API
- VFIO is something we are looking at at the moment and have prototyped with
pKVM; it should be supported soon in Android (only for platform
devices for now)
> Yeah, that part would be critical to help decide which route to pursue
> first, even when all options might be required in the end once pKVM
> is scaled to more scenarios. As you mentioned in another mail, a staged
> approach to evolving this would be much preferable.
I agree that would probably be the case. I will work on a more staged
approach for v3, mostly without the pv part, as Jason suggested.
>
> The pros/cons between nesting and para-virt are clear: the more static
> the mapping is, the more the para approach gains due to less page-table
> walking and a smaller TLB footprint, while conversely nesting performs
> much better by avoiding frequent para calls for page table mgmt. 😊
I am also working on getting numbers for both cases so we know
the order of magnitude of each, as I suspect it won't be obvious
which approach is best for large systems with many DMA initiators.
Thanks,
Mostafa
>
> >
> > > But I'm not sure that I got Jason's point about "there is no way to get
> > > SVA support with para-virtualization." virtio-iommu is a para-virtualized
> > > model, and SVA support is in its plan. The main requirement is to pass
> > > the base pointer of the guest CPU page table to the backend and relay
> > > PRI faults/responses back and forth.
> >
> > That's nesting, you have a full page table under the control of the
> > guest, and the guest needs to have a level of HW-specific
> > knowledge. It is just an alternative to using the native nesting
> > vIOMMU.
> >
> > What I mean by "para-virtualization" is that the guest does map/unmap
> > calls to the hypervisor and has no page table.
> >
>
> Yes, that should never happen for SVA.