Message-ID: <20240809194335.1726916-1-seanjc@google.com>
Date: Fri, 9 Aug 2024 12:43:12 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>, Paolo Bonzini <pbonzini@...hat.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
Oliver Upton <oliver.upton@...ux.dev>, Marc Zyngier <maz@...nel.org>, Peter Xu <peterx@...hat.com>,
James Houghton <jthoughton@...gle.com>
Subject: [PATCH 00/22] KVM: x86/mmu: Allow yielding on mmu_notifier zap
The main intent of this series is to allow yielding, i.e. cond_resched(),
when unmapping memory in shadow MMUs in response to an mmu_notifier
invalidation. There is zero reason not to yield, and in fact I _thought_
KVM did yield, but because of how KVM grew over the years, the unmap path
got left behind.
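
For reference, the yielding pattern itself is nothing fancy; a rough
sketch (kvm_zap_one_rmap() is a hypothetical stand-in, the actual series
plumbs a @can_yield flag into __walk_slot_rmaps()):

  static bool kvm_zap_rmaps_yielding(struct kvm *kvm,
  				     struct kvm_memory_slot *slot,
  				     gfn_t start, gfn_t end, bool can_yield)
  {
  	bool flush = false;
  	gfn_t gfn;

  	lockdep_assert_held_write(&kvm->mmu_lock);

  	for (gfn = start; gfn < end; gfn++) {
  		/* Hypothetical helper, stands in for the real rmap zap. */
  		flush |= kvm_zap_one_rmap(kvm, slot, gfn);

  		/*
  		 * Drop mmu_lock on contention or a pending reschedule,
  		 * flushing TLBs first so that no vCPU can consume a
  		 * zapped SPTE while the lock is dropped.
  		 */
  		if (can_yield &&
  		    (need_resched() || rwlock_needbreak(&kvm->mmu_lock))) {
  			if (flush) {
  				kvm_flush_remote_tlbs(kvm);
  				flush = false;
  			}
  			cond_resched_rwlock_write(&kvm->mmu_lock);
  		}
  	}
  	return flush;
  }
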
The first half of the series reworks max_guest_memory_test into
mmu_stress_test, to give some confidence in the mmu_notifier-related
changes.
Oliver and Marc, there's one patch lurking in here to enable said test on
arm64. It's as well tested as I can make it (and that took much longer
than anticipated, because arm64 hit races in the test that x86 doesn't
hit, for whatever reason).
The middle of the series reworks x86's shadow MMU logic to use the
zap flow that can yield.
The last third or so is a wee bit adventurous, and is kind of an RFC, but
well tested. It's essentially prep/post work for James' MGLRU, and allows
aging SPTEs in x86's shadow MMU to run outside of mmu_lock, e.g. so that
nested TDP (stage-2) MMUs can participate in MGLRU.
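
Roughly speaking, lockless aging boils down to treating each SPTE as an
atomic; a simplified sketch (the actual patches add dedicated
infrastructure for the lockless rmap walk, this only shows the SPTE
update):

  static bool kvm_age_spte_lockless(u64 *sptep)
  {
  	u64 old_spte = READ_ONCE(*sptep);

  	if (!is_shadow_present_pte(old_spte) ||
  	    !(old_spte & shadow_accessed_mask))
  		return false;

  	/*
  	 * Best-effort clear of the Accessed bit; losing a race with a
  	 * concurrent update is fine, the page was young regardless.
  	 */
  	(void)try_cmpxchg64(sptep, &old_spte,
  			    old_spte & ~shadow_accessed_mask);
  	return true;
  }
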
If everything checks out, my goal is to land the selftests and yielding
changes in 6.12. The aging stuff is incomplete and meaningless without
James' MGLRU; I'm posting it here purely so that folks can see the end
state when the mmu_notifier invalidation paths also move to a different
API.
James, the aging stuff is quite well tested (see below). Can you try
working it into/on top of your MGLRU series? And if you're feeling very
kind, hammer it a bit more? :-) I haven't looked at the latest ideas
and/or discussion on the MGLRU series, but I'm hoping that being able to
support the shadow MMU (absent the stupid eptad=0 case) in MGLRU will
allow for fewer shenanigans, e.g. no need to toggle flags during runtime.
As for testing, I spun up a VM and ran a compilation loop and `stress` in
the VM, while simultaneously running a small userspace program to age the
VM's memory (also in an infinite loop), using the same basic methodology as
access_tracking_perf_test.c (I put almost all of guest memory into a
memfd and then aged only that range of memory).
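
The userspace side is the usual page_idle dance; a condensed sketch of
the approach (not the actual program, and the PFN lookup via
/proc/self/pagemap is omitted):

  #include <stdint.h>
  #include <unistd.h>

  /*
   * Mark a range of host PFNs idle via /sys/kernel/mm/page_idle/bitmap,
   * same basic methodology as access_tracking_perf_test.c.  Each bit in
   * the bitmap covers one PFN; a page that later reads back as non-idle
   * was accessed since it was marked idle.
   */
  static void mark_pfn_range_idle(int page_idle_fd, uint64_t start_pfn,
  				  uint64_t nr_pfns)
  {
  	uint64_t mask = ~0ull;
  	uint64_t pfn;

  	for (pfn = start_pfn; pfn < start_pfn + nr_pfns; pfn += 64)
  		pwrite(page_idle_fd, &mask, sizeof(mask), (pfn / 64) * 8);
  }
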
I confirmed that the locking does work, e.g. that there was (infrequent)
contention, and am fairly confident that the idea pans out. E.g. I hit
the BUG_ON(!is_shadow_present_pte()) using that setup, which is the only
reason those patches exist :-)
Sean Christopherson (22):
KVM: selftests: Check for a potential unhandled exception iff KVM_RUN
succeeded
KVM: selftests: Rename max_guest_memory_test to mmu_stress_test
KVM: selftests: Only muck with SREGS on x86 in mmu_stress_test
KVM: selftests: Compute number of extra pages needed in
mmu_stress_test
KVM: selftests: Enable mmu_stress_test on arm64
KVM: selftests: Use vcpu_arch_put_guest() in mmu_stress_test
KVM: selftests: Precisely limit the number of guest loops in
mmu_stress_test
KVM: selftests: Add a read-only mprotect() phase to mmu_stress_test
KVM: selftests: Verify KVM correctly handles mprotect(PROT_READ)
KVM: x86/mmu: Move walk_slot_rmaps() up near
for_each_slot_rmap_range()
KVM: x86/mmu: Plumb a @can_yield parameter into __walk_slot_rmaps()
KVM: x86/mmu: Add a helper to walk and zap rmaps for a memslot
KVM: x86/mmu: Honor NEED_RESCHED when zapping rmaps and blocking is
allowed
KVM: x86/mmu: Morph kvm_handle_gfn_range() into an aging specific
helper
KVM: x86/mmu: Fold mmu_spte_age() into kvm_rmap_age_gfn_range()
KVM: x86/mmu: Add KVM_RMAP_MANY to replace open coded '1' and '1ul'
literals
KVM: x86/mmu: Refactor low level rmap helpers to prep for walking w/o
mmu_lock
KVM: x86/mmu: Use KVM_PAGES_PER_HPAGE() instead of an open coded
equivalent
KVM: x86/mmu: Add infrastructure to allow walking rmaps outside of
mmu_lock
KVM: x86/mmu: Add support for lockless walks of rmap SPTEs
KVM: x86/mmu: Support rmap walks without holding mmu_lock when aging
gfns
***HACK*** KVM: x86: Don't take mmu_lock when aging gfns
arch/x86/kvm/mmu/mmu.c | 527 +++++++++++-------
arch/x86/kvm/svm/svm.c | 2 +
arch/x86/kvm/vmx/vmx.c | 2 +
tools/testing/selftests/kvm/Makefile | 3 +-
tools/testing/selftests/kvm/lib/kvm_util.c | 3 +-
..._guest_memory_test.c => mmu_stress_test.c} | 144 ++++-
virt/kvm/kvm_main.c | 7 +-
7 files changed, 482 insertions(+), 206 deletions(-)
rename tools/testing/selftests/kvm/{max_guest_memory_test.c => mmu_stress_test.c} (65%)
base-commit: 332d2c1d713e232e163386c35a3ba0c1b90df83f
--
2.46.0.76.ge559c4bf1a-goog