Message-ID: <Y8mBczFH/Hw6xot0@ziepe.ca>
Date: Thu, 19 Jan 2023 13:44:19 -0400
From: Jason Gunthorpe <jgg@...pe.ca>
To: "Kalra, Ashish" <ashish.kalra@....com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
joro@...tes.org, robin.murphy@....com, thomas.lendacky@....com,
vasant.hegde@....com, jon.grimm@....com
Subject: Re: [PATCH 1/4] iommu/amd: Introduce Protection-domain flag VFIO

On Thu, Jan 19, 2023 at 02:54:43AM -0600, Kalra, Ashish wrote:
> Hello Jason,
>
> On 1/13/2023 9:33 AM, Jason Gunthorpe wrote:
> > On Tue, Jan 10, 2023 at 08:31:34AM -0600, Suravee Suthikulpanit wrote:
> > > Currently, to detect if a domain is enabled with VFIO support, the driver
> > > checks if the domain has devices attached and checks if the domain type is
> > > IOMMU_DOMAIN_UNMANAGED.
> >
> > NAK
> >
> > If you need weird HW specific stuff like this then please implement it
> > properly in iommufd, not try and randomly guess what things need from
> > the domain type.
> >
> > All this confidential computing stuff needs a comprehensive solution,
> > not some piecemeal mess. How can you even use a CC guest with VFIO in
> > the upstream kernel? Hmm?
> >
>
> Currently all guest devices are untrusted - whether they are emulated,
> virtio or passthrough. In the current use case of VFIO device-passthrough to
> an SNP guest, the pass-through device will perform DMA to un-encrypted or
> shared guest memory, in the same way as virtio or emulated devices.
>
> This fix is prompted by an issue reported by Nvidia, who are trying to do
> PCIe device passthrough to an SNP guest. The memory for DMA is allocated
> through dma_alloc_coherent() in the SNP guest, and during DMA I/O an
> RMP_PAGE_FAULT is observed on the host.
>
> These dma_alloc_coherent() calls translate into page state change hypercalls
> to the host, which change the guest page state from encrypted to shared in
> the RMP table.
>
> Following is a link to the issue discussed above:
> https://github.com/AMDESE/AMDSEV/issues/109

Wow you should really write all of this in the commit message.
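
For reference, the guest-side flow being described is roughly the
following; this is a simplified sketch, not the literal upstream call
chain, and the helper name is made up for illustration:

    /*
     * Simplified sketch of the SNP guest side.  A device driver asks for
     * a DMA-coherent buffer; because guest memory is encrypted by
     * default, the DMA layer has to flip the backing pages to shared
     * before the device can access them, and that conversion is what
     * generates the page state change (encrypted -> shared) requests
     * which update the RMP on the host.
     */
    #include <linux/dma-mapping.h>

    static void *snp_guest_alloc_dma(struct device *dev, size_t size,
                                     dma_addr_t *dma_handle)
    {
            /*
             * Internally this ends up doing the equivalent of
             * set_memory_decrypted() on the allocated pages, which on an
             * SNP guest issues the page state change hypercalls to the
             * host.
             */
            return dma_alloc_coherent(dev, size, dma_handle, GFP_KERNEL);
    }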

> Now, to set individual 4K entries to different shared/private mappings
> within a large page in the NPT or host page tables, the RMP and NPT/host
> page table large page entries are split into 4K PTEs.

Why are mappings to private pages even in the iommu in the first
place - and how did they even get there?

I thought the design for the private memory was walling it off in a
memfd and making it un-gup'able?

This seems to be your actual problem: somehow the iommu is being
loaded with private memory PFNs instead of only being loaded with
shared PFNs when shared mappings are created?

If the IOMMU mappings actually only extend to the legitimate shared
pages then you don't have a problem with large IOPTEs spanning a
mixture of page types.

> The fix is to force 4K page size for IOMMU page tables for SNP guests.

But even if you want to pursue this as the fix, it should not be done
in this way.
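
(For reference, "forcing 4K" at the iommu level ultimately means the
domain only advertises the 4K page size to iommu_map(); the sketch
below is a hypothetical illustration of that mechanism, not what this
patch does:)

    /*
     * Hypothetical illustration only, not the patch under discussion:
     * keeping the core code from ever installing large IOPTEs comes
     * down to advertising only the 4K page size in the domain's
     * pgsize_bitmap at allocation time.
     */
    #include <linux/iommu.h>
    #include <linux/sizes.h>

    static void example_force_4k_iopte(struct iommu_domain *domain)
    {
            /* Drop 2M/1G (and any other) page sizes; keep only 4K */
            domain->pgsize_bitmap = SZ_4K;
    }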

> This patch-set adds support to detect if a domain belongs to an SNP-enabled
> guest. This way it can set the default page size of a domain to 4K only for
> SNP-enabled guests and allow non-SNP guests to use larger page sizes.

As I said, the KVM has nothing to do with the iommu and I want to
largely keep it that way.

If the VMM needs to request a 4k-page-size-only iommu_domain because
it is somehow mapping mixtures of private and public pages, then the
VMM knows it is doing this crazy thing and it needs to ask iommufd
directly for a customized iommu_domain from the driver.

No KVM interconnection.

In fact, we already have a way to do this in iommufd generically: have
the VMM set IOMMU_OPTION_HUGE_PAGES = 0.
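
From the VMM side that is a single ioctl on the IOAS; a rough sketch,
assuming the iommufd uapi as merged, with error handling elided and
ioas_id coming from an earlier IOMMU_IOAS_ALLOC:

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/iommufd.h>

    /*
     * Sketch: tell iommufd not to use huge pages for this IOAS, so only
     * PAGE_SIZE mappings are created for it.  iommufd_fd is the open
     * /dev/iommu fd, ioas_id comes from a prior IOMMU_IOAS_ALLOC.
     */
    static int ioas_disable_huge_pages(int iommufd_fd, uint32_t ioas_id)
    {
            struct iommu_option opt = {
                    .size = sizeof(opt),
                    .option_id = IOMMU_OPTION_HUGE_PAGES,
                    .op = IOMMU_OPTION_OP_SET,
                    .object_id = ioas_id,
                    .val64 = 0,     /* 0 = only PAGE_SIZE mappings */
            };

            return ioctl(iommufd_fd, IOMMU_OPTION, &opt);
    }
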
Jason