Message-ID: <CADrL8HXjLjVyFiFee9Q58TQ9zBfXiO+VG=25Rw4UD+fbDmxQFg@mail.gmail.com>
Date: Wed, 28 May 2025 11:48:09 -0400
From: James Houghton <jthoughton@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Jonathan Corbet <corbet@....net>, Marc Zyngier <maz@...nel.org>,
Oliver Upton <oliver.upton@...ux.dev>, Yan Zhao <yan.y.zhao@...el.com>,
Nikita Kalyazin <kalyazin@...zon.com>, Anish Moorthy <amoorthy@...gle.com>,
Peter Gonda <pgonda@...gle.com>, Peter Xu <peterx@...hat.com>,
David Matlack <dmatlack@...gle.com>, wei.w.wang@...el.com, kvm@...r.kernel.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.linux.dev,
Jiaqi Yan <jiaqiyan@...gle.com>
Subject: Re: [PATCH v2 00/13] KVM: Introduce KVM Userfault
On Tue, May 6, 2025 at 8:13 PM Sean Christopherson <seanjc@...gle.com> wrote:
>
> On Thu, Jan 09, 2025, James Houghton wrote:
> > KVM: Add KVM_MEM_USERFAULT memslot flag and bitmap
> > KVM: Add KVM_MEMORY_EXIT_FLAG_USERFAULT
> > KVM: Allow late setting of KVM_MEM_USERFAULT on guest_memfd memslot
> > KVM: Advertise KVM_CAP_USERFAULT in KVM_CHECK_EXTENSION
> > KVM: x86/mmu: Add support for KVM_MEM_USERFAULT
> > KVM: arm64: Add support for KVM_MEM_USERFAULT
> > KVM: selftests: Fix vm_mem_region_set_flags docstring
> > KVM: selftests: Fix prefault_mem logic
> > KVM: selftests: Add va_start/end into uffd_desc
> > KVM: selftests: Add KVM Userfault mode to demand_paging_test
> > KVM: selftests: Inform set_memory_region_test of KVM_MEM_USERFAULT
> > KVM: selftests: Add KVM_MEM_USERFAULT + guest_memfd toggle tests
> > KVM: Documentation: Add KVM_CAP_USERFAULT and KVM_MEM_USERFAULT details
> >
> > Documentation/virt/kvm/api.rst | 33 +++-
> > arch/arm64/kvm/Kconfig | 1 +
> > arch/arm64/kvm/mmu.c | 26 +++-
> > arch/x86/kvm/Kconfig | 1 +
> > arch/x86/kvm/mmu/mmu.c | 27 +++-
> > arch/x86/kvm/mmu/mmu_internal.h | 20 ++-
> > arch/x86/kvm/x86.c | 36 +++--
> > include/linux/kvm_host.h | 19 ++-
> > include/uapi/linux/kvm.h | 6 +-
> > .../selftests/kvm/demand_paging_test.c | 145 ++++++++++++++++--
> > .../testing/selftests/kvm/include/kvm_util.h | 5 +
> > .../selftests/kvm/include/userfaultfd_util.h | 2 +
> > tools/testing/selftests/kvm/lib/kvm_util.c | 42 ++++-
> > .../selftests/kvm/lib/userfaultfd_util.c | 2 +
> > .../selftests/kvm/set_memory_region_test.c | 33 ++++
> > virt/kvm/Kconfig | 3 +
> > virt/kvm/kvm_main.c | 54 ++++++-
> > 17 files changed, 419 insertions(+), 36 deletions(-)
>
> I didn't look at the selftests changes, but nothing in this series scares me. We
> bikeshedded most of this to death in the "exit on missing" series, so for me at
> least, the only real question is whether or not we want to add the uAPI. AFAIK,
> this is the best proposal for post-copy guest_memfd support (and not just because
> it's the only proposal :-D).

The only thing that I want to call out again is that this UAPI works
great when we are going from userfault --> !userfault. That is, it
works well for postcopy (both for guest_memfd and for standard
memslots where userfaultfd scalability is a concern).

But there is another use case worth bringing up: unmapping pages that
the VMM is emulating as poisoned.

Normally this can be handled by mm (e.g. with UFFDIO_POISON), but for
4K poison within a HugeTLB-backed memslot (if the HugeTLB page remains
mapped in userspace), KVM Userfault is the only option (if we don't
want to punch holes in memslots). This leaves us with three problems:

1. If using KVM Userfault to emulate poison, we are stuck with small
pages in stage 2 for the entire memslot.
2. We must unmap everything when toggling on KVM Userfault just to
unmap a single page.
3. If KVM Userfault is already enabled, we have no choice but to
toggle KVM Userfault off and on again to unmap the newly poisoned
pages (i.e., there is no ioctl to scan the bitmap and unmap
newly-userfault pages).

All of these are non-issues if we emulate poison by removing memslots,
and I think that's possible. But if that proves too slow, we'd need to
be a little bit more clever with hugepage recovery and with unmapping
newly-userfault pages, both of which I think can be solved by adding
some kind of bitmap re-scan ioctl. We can do that later if the need
arises.
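
To make the re-scan idea a bit more concrete, here is a rough userspace
sketch of the core comparison such an ioctl might perform. Nothing here
is from the series: find_newly_userfault(), its parameters, and the idea
of keeping a snapshot of the bitmap that was last acted on are all
assumptions for illustration only. It takes that snapshot (old_bm) and
the current userfault bitmap (new_bm) and reports the pages whose bit
went from 0 to 1, which are the pages that would need to be unmapped.
It relies on __builtin_ctzll, so it assumes GCC or Clang.

```c
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD 64

/*
 * Hypothetical helper (not part of the posted series).
 *
 * old_bm:  snapshot of the userfault bitmap that was last acted on
 * new_bm:  current userfault bitmap
 * npages:  number of pages covered by the bitmaps
 * out:     receives the page indices whose bit went 0 -> 1
 * out_cap: capacity of out, in entries
 *
 * Returns the total number of newly-userfault pages found, which may
 * exceed out_cap; only the first out_cap indices are stored.
 */
static size_t find_newly_userfault(const uint64_t *old_bm,
                                   const uint64_t *new_bm,
                                   size_t npages,
                                   uint64_t *out, size_t out_cap)
{
    size_t nwords = (npages + BITS_PER_WORD - 1) / BITS_PER_WORD;
    size_t found = 0;

    for (size_t w = 0; w < nwords; w++) {
        /* Bits set now that were clear in the snapshot. */
        uint64_t newly = new_bm[w] & ~old_bm[w];

        while (newly) {
            unsigned int bit = (unsigned int)__builtin_ctzll(newly);
            uint64_t page = (uint64_t)w * BITS_PER_WORD + bit;

            newly &= newly - 1; /* clear the lowest set bit */

            /* Ignore stray bits past the end of the last word. */
            if (page >= npages)
                break;

            if (found < out_cap)
                out[found] = page;
            found++;
        }
    }

    return found;
}
```

A real implementation would presumably live in KVM and zap the stage 2
mappings for each reported page rather than return a list; the sketch
only shows why a re-scan could avoid unmapping the whole memslot.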
> So... yes?
Thanks Sean!
> Attached is a variation on the series using the common "struct kvm_page_fault"
> idea. The documentation change could be squashed with the final enablement patch.
>
> Compile tested only. I would not be the least bit surprised if I completely
> butchered something.
Looks good! The new selftests work just fine.