Message-ID: <20211119160023.GI876299@ziepe.ca>
Date: Fri, 19 Nov 2021 12:00:23 -0400
From: Jason Gunthorpe <jgg@...pe.ca>
To: David Hildenbrand <david@...hat.com>
Cc: Chao Peng <chao.p.peng@...ux.intel.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
linux-fsdevel@...r.kernel.org, qemu-devel@...gnu.org,
Paolo Bonzini <pbonzini@...hat.com>,
Jonathan Corbet <corbet@....net>,
Sean Christopherson <seanjc@...gle.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
x86@...nel.org, "H . Peter Anvin" <hpa@...or.com>,
Hugh Dickins <hughd@...gle.com>,
Jeff Layton <jlayton@...nel.org>,
"J . Bruce Fields" <bfields@...ldses.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Yu Zhang <yu.c.zhang@...ux.intel.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
luto@...nel.org, john.ji@...el.com, susie.li@...el.com,
jun.nakajima@...el.com, dave.hansen@...el.com, ak@...ux.intel.com
Subject: Re: [RFC v2 PATCH 01/13] mm/shmem: Introduce F_SEAL_GUEST

On Fri, Nov 19, 2021 at 04:39:15PM +0100, David Hildenbrand wrote:
> > If qemu can put all the guest memory in a memfd and not map it, then
> > I'd also like to see that the IOMMU can use this interface too so we
> > can have VFIO working in this configuration.
>
> In QEMU we usually want to (and must) be able to access guest memory
> from user space, with the current design we wouldn't even be able to
> temporarily mmap it -- which makes sense for encrypted memory only. The
> corner case really is encrypted memory. So I don't think we'll see a
> broad use of this feature outside of encrypted VMs in QEMU. I might be
> wrong, most probably I am :)

Interesting..

The non-encrypted case I had in mind is the horrible flow in VFIO to
support qemu re-execing itself (VFIO_DMA_UNMAP_FLAG_VADDR).

Here VFIO is connected to a VA in an mm_struct that will become invalid
during the exec period, but VFIO needs to continue to access it. For
IOMMU cases this is OK because the memory is already pinned, but for
the 'emulated iommu' used by mdevs, pages are pinned dynamically. qemu
needs to ensure that VFIO can continue to access the pages across the
exec, even though there is nothing to pin_user_pages() on.
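
For reference, a rough userspace sketch of that flow against the type1
IOMMU uAPI in <linux/vfio.h> (headers from 5.12 or later). The helper
names are made up for illustration, and container_fd is assumed to be
an already configured VFIO container:

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Hypothetical helper: called before exec(). Invalidates the vaddr
 * recorded for an IOVA range while leaving the DMA mapping in place.
 */
int vfio_suspend_vaddr(int container_fd, uint64_t iova, uint64_t size)
{
	struct vfio_iommu_type1_dma_unmap unmap;

	memset(&unmap, 0, sizeof(unmap));
	unmap.argsz = sizeof(unmap);
	unmap.flags = VFIO_DMA_UNMAP_FLAG_VADDR;
	unmap.iova = iova;
	unmap.size = size;

	return ioctl(container_fd, VFIO_IOMMU_UNMAP_DMA, &unmap);
}

/*
 * Hypothetical helper: called after exec(), once the new process has
 * mapped guest memory again. Supplies the new vaddr for the same IOVA
 * range; READ/WRITE flags are not set when only updating the vaddr.
 */
int vfio_resume_vaddr(int container_fd, uint64_t iova, uint64_t size,
		      void *new_vaddr)
{
	struct vfio_iommu_type1_dma_map map;

	memset(&map, 0, sizeof(map));
	map.argsz = sizeof(map);
	map.flags = VFIO_DMA_MAP_FLAG_VADDR;
	map.vaddr = (uint64_t)(uintptr_t)new_vaddr;
	map.iova = iova;
	map.size = size;

	return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}
```

Between the two calls the kernel has no usable vaddr for the range,
which is the window where dynamically pinned mdev pages are a problem.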

This flow would work a lot better if VFIO were connected to the memfd
that is storing the guest memory. Then it naturally doesn't get
disrupted by exec() and we don't need the mess in the kernel..
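
To illustrate the property being relied on, here is a small sketch
showing that the contents of a memfd belong to the file and not to any
VA mapping of it. The function name and the "guest-ram" label are
placeholders, and munmap() only stands in for the mm teardown that
exec() would do:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo: returns 0 if data written through a mapping is
 * still readable from the memfd after the mapping is gone, -1 on any
 * failure.
 */
int memfd_outlives_mapping(void)
{
	const char msg[] = "guest page";
	char buf[sizeof(msg)] = { 0 };
	long page = sysconf(_SC_PAGESIZE);
	char *va;
	int ret = -1;
	int fd;

	/* No MFD_CLOEXEC, so the fd itself would be inherited across exec(). */
	fd = memfd_create("guest-ram", 0);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, page) < 0)
		goto out;

	va = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (va == MAP_FAILED)
		goto out;
	memcpy(va, msg, sizeof(msg));

	/* Drop the VA; the pages stay attached to the memfd. */
	munmap(va, page);

	if (pread(fd, buf, sizeof(buf), 0) == (ssize_t)sizeof(buf) &&
	    memcmp(buf, msg, sizeof(msg)) == 0)
		ret = 0;
out:
	close(fd);
	return ret;
}
```

Anything holding a reference to the file rather than to the VA would
see the same pages before and after the exec().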

I was wondering if we could get here using the direct_io APIs but this
would do the job too.

> Apart from the special "encrypted memory" semantics, I assume nothing
> speaks against allowing for mmapping these memfds, for example, for any
> other VFIO use cases.

We will eventually have VFIO with "encrypted memory". There was a talk
at LPC about the enabling work for this.

So, if the plan is to put fully encrypted memory inside a memfd, then
we will still eventually need a way to pull its pfns into the
IOMMU, presumably along with the access control parameters needed to
pass to the secure monitor to join a PCI device to the secure memory.

Jason