Message-ID: <Z9SLwcWCMfmtwDZA@x1.local>
Date: Fri, 14 Mar 2025 16:04:17 -0400
From: Peter Xu <peterx@...hat.com>
To: Nikita Kalyazin <kalyazin@...zon.com>
Cc: James Houghton <jthoughton@...gle.com>, akpm@...ux-foundation.org,
pbonzini@...hat.com, shuah@...nel.org, kvm@...r.kernel.org,
linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, lorenzo.stoakes@...cle.com, david@...hat.com,
ryan.roberts@....com, quic_eberman@...cinc.com, graf@...zon.de,
jgowans@...zon.com, roypat@...zon.co.uk, derekmn@...zon.com,
nsaenz@...zon.es, xmarcalx@...zon.com
Subject: Re: [RFC PATCH 0/5] KVM: guest_memfd: support for uffd missing
On Fri, Mar 14, 2025 at 05:12:35PM +0000, Nikita Kalyazin wrote:
> Anyway, it looks like the solution we discussed allows to choose between
> memcpy-only and memcpy/write-combined userspace implementations. I'm going
> to work on the next version of the series that would include MINOR trap and
> avoiding KVM dependency in mm via calling vm_ops->fault() in
> UFFDIO_CONTINUE.
I'll attach some more context, not directly relevant to this series, but
just FYI.
One thing I am not yet sure about is whether we will ultimately still need to
register userfaultfd with another fd using offset ranges. The question is
whether there will be demand for userfaultfd trapping in the pure private CoCo
use case later. The only thing I'm not sure about is whether all guest-memfd
use cases allow mmap(). If they do, then maybe we can stick with the current
UFFDIO_REGISTER on VA ranges.
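For reference, registration on a VA range with the existing userfaultfd UAPI
looks roughly like the sketch below. It only illustrates the current
UFFDIO_REGISTER interface in MISSING mode; the helper names
(uffd_missing_range, uffd_open, uffd_register_missing) are hypothetical, and
whether a guest-memfd mapping can be registered this way is exactly the open
question above, not something this sketch assumes to work.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: describe a MISSING-mode registration for the
 * virtual address range [addr, addr + len). */
static struct uffdio_register uffd_missing_range(void *addr, size_t len)
{
    struct uffdio_register reg;

    assert(len != 0); /* the kernel rejects zero-length ranges */
    memset(&reg, 0, sizeof(reg));
    reg.range.start = (uint64_t)(uintptr_t)addr;
    reg.range.len = len;
    reg.mode = UFFDIO_REGISTER_MODE_MISSING;
    return reg;
}

/* Hypothetical helper: open a userfaultfd and do the UFFDIO_API handshake.
 * Returns the fd, or -1 on failure (e.g. insufficient privileges). */
static int uffd_open(void)
{
    struct uffdio_api api;
    int uffd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);

    if (uffd < 0)
        return -1;
    memset(&api, 0, sizeof(api));
    api.api = UFFD_API;
    if (ioctl(uffd, UFFDIO_API, &api) < 0) {
        close(uffd);
        return -1;
    }
    return uffd;
}

/* Hypothetical helper: register [addr, addr + len) on an open userfaultfd.
 * Returns 0 on success, -1 with errno set on failure. */
static int uffd_register_missing(int uffd, void *addr, size_t len)
{
    struct uffdio_register reg = uffd_missing_range(addr, len);

    return ioctl(uffd, UFFDIO_REGISTER, &reg);
}
```

A caller would pass the address returned by mmap() of the fd and a
page-aligned length; registration by fd and offset range would need a
different interface than this one.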
In any case, I think you can proceed with whatever you plan to do to add
initial guest-memfd userfaultfd support, as long as it is acceptable to the
KVM list.
The other thing is, what you're looking for indeed looks very close to what
we may need. We want to have purely shared guest-memfd working just like
vanilla memfd_create(), not only for 4K but also for huge pages. We also want
GUP working, so it can replace the old hugetlbfs use case.
I have a feeling that all the guest-memfd directions recently discussed on
the list will ultimately need huge pages. It may be the same for you, except
that your use case does not allow any permanent mapping that is visible to
the kernel. Probably that's why GUP is forbidden but kmap isn't in your
write()s; please bear with me if I got things wrong, I don't understand your
use case well.
Just in case it's helpful, I have some PoC branches ready that allow 1G pages
to be mapped to userspace.
https://github.com/xzpeter/linux/commits/peter-gmem-v0.2/
The work is based on Ackerley's 1G series, which contains most of the folio
management part (but I fixed quite a few bugs in my tree; I believe
Ackerley should have them fixed in his to-be-posted version too). I also have
a QEMU branch ready that can boot with it (I haven't tested much beyond that
yet).
https://github.com/xzpeter/qemu/commits/peter-gmem-v0.2/
For example, besides guest-memfd alone, we definitely also need guest-memfd
to be trappable by userfaultfd, which is what you are trying to do here, one
way or another.
Thanks,
--
Peter Xu