Message-ID: <BN9PR11MB52760686ABB3145E44BAD95A8C1A2@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Thu, 16 Jan 2025 08:51:11 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Mostafa Saleh <smostafa@...gle.com>, Jason Gunthorpe <jgg@...pe.ca>
CC: "iommu@...ts.linux.dev" <iommu@...ts.linux.dev>, "kvmarm@...ts.linux.dev"
<kvmarm@...ts.linux.dev>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>, "catalin.marinas@....com"
<catalin.marinas@....com>, "will@...nel.org" <will@...nel.org>,
"maz@...nel.org" <maz@...nel.org>, "oliver.upton@...ux.dev"
<oliver.upton@...ux.dev>, "joey.gouly@....com" <joey.gouly@....com>,
"suzuki.poulose@....com" <suzuki.poulose@....com>, "yuzenghui@...wei.com"
<yuzenghui@...wei.com>, "robdclark@...il.com" <robdclark@...il.com>,
"joro@...tes.org" <joro@...tes.org>, "robin.murphy@....com"
<robin.murphy@....com>, "jean-philippe@...aro.org"
<jean-philippe@...aro.org>, "nicolinc@...dia.com" <nicolinc@...dia.com>,
"vdonnefort@...gle.com" <vdonnefort@...gle.com>, "qperret@...gle.com"
<qperret@...gle.com>, "tabba@...gle.com" <tabba@...gle.com>,
"danielmentz@...gle.com" <danielmentz@...gle.com>, "tzukui@...gle.com"
<tzukui@...gle.com>
Subject: RE: [RFC PATCH v2 00/58] KVM: Arm SMMUv3 driver for pKVM
> From: Mostafa Saleh <smostafa@...gle.com>
> Sent: Wednesday, January 8, 2025 8:10 PM
>
> On Thu, Jan 02, 2025 at 04:16:14PM -0400, Jason Gunthorpe wrote:
> > On Fri, Dec 13, 2024 at 07:39:04PM +0000, Mostafa Saleh wrote:
> > > I am open to any suggestions, but I believe any solution considered for
> > > merge should have enough features to be usable on actual systems (a
> > > translating IOMMU can be used, for example), so either para-virt as in
> > > this series or full nesting as in the PoC above (or maybe both?), which
> > > IMO comes down to the trade-off mentioned above.
> >
> > IMHO no, you can have a completely usable solution without host/guest
> > controlled translation. This is equivalent to a bare metal system with
> > no IOMMU HW. This exists and is still broadly useful. The majority of
> > cloud VMs out there are in this configuration.
> >
> > That is the simplest/smallest thing to start with. Adding host/guest
> > controlled translation is a build-on-top exercise that seems to have
> > a lot of options and people may end up wanting to do all of them.
> >
> > I don't think you need to show that host/guest controlled translation
> > is possible to make progress, of course it is possible. Just getting
> > to the point where pkvm can own the SMMU HW and provide DMA isolation
> > between all of its direct hosts/guests is a good step.
>
> My plan was basically:
> 1) Finish and send nested SMMUv3 as RFC, with more insights about
> performance and complexity trade-offs of both approaches.
>
> 2) Discuss next steps for the upstream solution in an upcoming conference
> (like LPC or earlier if possible) and work on upstreaming it.
>
> 3) Work on guest device passthrough and IOMMU support.
>
> I am open to gradually upstreaming this as you mentioned, where as a
> first step pKVM would establish DMA isolation without translation for
> the host; that should be enough to have a functional pKVM and run
> protected workloads.

Does that approach assume starting from a full-fledged SMMU driver
inside pKVM, or do we still expect the host to enumerate/initialize
the HW (but skip any translation) so the pKVM part can focus only
on managing translation?

I'm curious about the burden of maintaining another IOMMU
subsystem under the KVM directory. It's not built into the host kernel
image, but it is hosted in the same kernel repo. This series tried to
reduce the duplication via io-pgtable-arm, but considerable duplication
still exists (~2000 LOC in pKVM). This would be very confusing going
forward and hard to maintain, e.g. ensuring bugs get fixed on both
sides.

The CPU side is a different story. IIUC, KVM-ARM has been a split
driver model from day one for nVHE. The split is kept even for VHE,
with the only difference being the use of a direct function call
instead of a hypercall. pKVM is added incrementally on top of nVHE,
hence it's natural to maintain the pKVM logic in the kernel repo.
No duplication.

But there is no such thing on the IOMMU side. Probably we'd want to
try reusing the entire IOMMU subsystem in pKVM if it's agreed
to use full-fledged drivers in pKVM. Or, if continuing the split-driver
model, should we try splitting the existing drivers into two parts and
connecting the two together via a function call natively and via a
hypercall in pKVM (similar to how KVM-ARM does it)?
Thanks
Kevin