Message-ID: <aMfMk/x5XJ1bfvzv@yzhao56-desk.sh.intel.com>
Date: Mon, 15 Sep 2025 16:21:39 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: Sean Christopherson <seanjc@...gle.com>
CC: <pbonzini@...hat.com>, <reinette.chatre@...el.com>,
<rick.p.edgecombe@...el.com>, <linux-kernel@...r.kernel.org>,
<kvm@...r.kernel.org>
Subject: Re: [PATCH v2 3/3] KVM: selftests: Test prefault memory during
concurrent memslot removal
On Mon, Sep 08, 2025 at 04:47:23PM -0700, Sean Christopherson wrote:
> On Fri, Aug 22, 2025, Yan Zhao wrote:
> > .../selftests/kvm/pre_fault_memory_test.c | 94 +++++++++++++++----
> > 1 file changed, 78 insertions(+), 16 deletions(-)
> >
> > diff --git a/tools/testing/selftests/kvm/pre_fault_memory_test.c b/tools/testing/selftests/kvm/pre_fault_memory_test.c
> > index 0350a8896a2f..56e65feb4c8c 100644
> > --- a/tools/testing/selftests/kvm/pre_fault_memory_test.c
> > +++ b/tools/testing/selftests/kvm/pre_fault_memory_test.c
> > @@ -10,12 +10,16 @@
> > #include <test_util.h>
> > #include <kvm_util.h>
> > #include <processor.h>
> > +#include <pthread.h>
> >
> > /* Arbitrarily chosen values */
> > #define TEST_SIZE (SZ_2M + PAGE_SIZE)
> > #define TEST_NPAGES (TEST_SIZE / PAGE_SIZE)
> > #define TEST_SLOT 10
> >
> > +static bool prefault_ready;
> > +static bool delete_thread_ready;
> > +
> > static void guest_code(uint64_t base_gpa)
> > {
> > volatile uint64_t val __used;
> > @@ -30,17 +34,47 @@ static void guest_code(uint64_t base_gpa)
> > GUEST_DONE();
> > }
> >
> > -static void pre_fault_memory(struct kvm_vcpu *vcpu, u64 gpa, u64 size,
> > - u64 left)
> > +static void *remove_slot_worker(void *data)
> > +{
> > + struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
> > +
> > + WRITE_ONCE(delete_thread_ready, true);
> > +
> > + while (!READ_ONCE(prefault_ready))
> > + cpu_relax();
> > +
> > + vm_mem_region_delete(vcpu->vm, TEST_SLOT);
> > +
> > + WRITE_ONCE(delete_thread_ready, false);
>
> Rather than use global variables, which necessitates these "dances" to get things
> back to the initial state, use an on-stack structure to communicate (and obviously
> make sure the structure is initialized :-D).
Sorry for the late reply.
Indeed, this makes the code more elegant!
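
For reference, a minimal userspace sketch of the on-stack handshake suggested
above. It is not the code from the patch: struct slot_worker_data, its fields,
slot_worker() and run_handshake() are hypothetical names, C11 atomics stand in
for the selftests' READ_ONCE()/WRITE_ONCE(), and a counter stands in for
vm_mem_region_delete().

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical handshake state; it lives on the caller's stack. */
struct slot_worker_data {
	atomic_bool worker_ready;	/* set by the worker once it is running */
	atomic_bool prefault_ready;	/* set by the caller right before prefaulting */
	int deletions;			/* stand-in for vm_mem_region_delete() */
};

static void *slot_worker(void *arg)
{
	struct slot_worker_data *data = arg;

	atomic_store(&data->worker_ready, true);

	/* Wait until the caller is about to prefault. */
	while (!atomic_load(&data->prefault_ready))
		sched_yield();

	data->deletions++;	/* the real worker would delete the memslot here */
	return NULL;
}

/* Runs one handshake and returns how many deletions the worker performed. */
static int run_handshake(void)
{
	/* Zero-initialized on every call, so no state needs to be reset. */
	struct slot_worker_data data = { 0 };
	pthread_t thread;
	int ret;

	ret = pthread_create(&thread, NULL, slot_worker, &data);
	assert(!ret);

	while (!atomic_load(&data.worker_ready))
		sched_yield();

	atomic_store(&data.prefault_ready, true);
	/* The real test would issue KVM_PRE_FAULT_MEMORY here. */

	pthread_join(thread, NULL);
	return data.deletions;
}
```

Because the structure is initialized at the top of run_handshake(), calling it
repeatedly needs no cleanup of the flags, unlike the global variables in v2.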
> > + return NULL;
> > +}
> > +
> > +static void pre_fault_memory(struct kvm_vcpu *vcpu, u64 base_gpa, u64 offset,
> > + u64 size, u64 left, bool private, bool remove_slot)
> > {
> > struct kvm_pre_fault_memory range = {
> > - .gpa = gpa,
> > + .gpa = base_gpa + offset,
> > .size = size,
> > .flags = 0,
> > };
> > - u64 prev;
> > + pthread_t remove_thread;
> > + bool remove_hit = false;
> > int ret, save_errno;
> > + u64 prev;
> >
> > + if (remove_slot) {
>
> I don't see any reason to make the slot removal conditional. There are three
> things we're interested in testing (so far):
>
> 1. Success
> 2. ENOENT due to no memslot
> 3. EAGAIN due to INVALID memslot
>
> #1 and #2 are mutually exclusive, or rather easier to test via separate testcases
> (because writing to non-existent memory is trivial). But for #3, I don't see a
> reason to make it mutually exclusive with #1 _or_ #2.
>
> As written, it's always mutually exclusive with #2 because otherwise it would be
> difficult (impossible?) to determine if KVM exited on the "right" address. But
> the only reason that's true is because the test recreates the slot *after*
> prefaulting, and _that_ makes #3 _conditionally_ mutually exclusive with #1,
> i.e. the test doesn't validate success if the INVALID memslot race is hit.
>
> Rather than make everything mutually exclusive, just restore the memslot and
> retry prefaulting. That also gives us easy bonus coverage that doing
> KVM_PRE_FAULT_MEMORY on memory that has already been faulted in is idempotent,
> i.e. that KVM_PRE_FAULT_MEMORY succeeds if it already succeeded (and nothing
> nuked the mappings in the interim).
That's a good idea.
> If the memslot is restored and the loop retries, then #3 becomes a complementary
> and orthogonal testcase to #1 and #2.
>
> This? (with an opportunistic s/left/expected_left that confused me; I thought
> "left" meant how many bytes were left to prefault, but it actually means how many
> bytes are expected to be left when failure occurs).
Looks good to me, except for a minor bug.
> + if (!slot_recreated) {
> + WRITE_ONCE(data.recreate_slot, true);
> + pthread_join(slot_worker, NULL);
> + slot_recreated = true;
> + continue;
If scheduling delays make delete_slot_worker() call vm_mem_region_delete() late
enough, __vcpu_ioctl() may already have prefaulted the entire range by this
point, i.e. it returned 0 and left range.size == 0.
How about checking range.size before continuing?
@@ -120,7 +126,8 @@ static void pre_fault_memory(struct kvm_vcpu *vcpu, u64 base_gpa, u64 offset,
WRITE_ONCE(data.recreate_slot, true);
pthread_join(slot_worker, NULL);
slot_recreated = true;
- continue;
+ if (range.size)
+ continue;
}
Otherwise, the next __vcpu_ioctl() is issued with range.size == 0 and returns -1
with errno == EINVAL, which breaks the assertion below.
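
To illustrate the sequence, a small simulation of the retry loop. It does not
call KVM: struct fake_range and fake_pre_fault() are hypothetical stand-ins, and
the only behavior assumed is the one described above, namely that an empty range
fails with EINVAL and that the first call completes before the slot is deleted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct kvm_pre_fault_memory. */
struct fake_range {
	uint64_t gpa;
	uint64_t size;
};

/*
 * Hypothetical stand-in for the KVM_PRE_FAULT_MEMORY ioctl: an empty range
 * fails with EINVAL, anything else is prefaulted completely.
 */
static int fake_pre_fault(struct fake_range *range)
{
	if (!range->size) {
		errno = EINVAL;
		return -1;
	}
	range->gpa += range->size;
	range->size = 0;
	return 0;
}

/*
 * Models the case where the worker deletes the slot only after the first
 * ioctl has finished. Returns the result of the last ioctl issued.
 */
static int prefault_after_late_delete(bool check_size)
{
	/* SZ_2M + PAGE_SIZE, as in TEST_SIZE. */
	struct fake_range range = { .gpa = 0x100000, .size = 0x201000 };
	bool slot_recreated = false;
	int ret;

	for (;;) {
		ret = fake_pre_fault(&range);

		if (!slot_recreated) {
			/* The worker has been joined and the slot restored. */
			slot_recreated = true;
			if (!check_size || range.size)
				continue;
		}
		break;
	}

	assert(slot_recreated);
	return ret;
}
```

With check_size == false the loop retries an empty range and ends with -1 and
errno == EINVAL; with check_size == true it keeps the successful result of the
first call.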
> + /*
> + * Assert success if prefaulting the entire range should succeed, i.e.
> + * complete with no bytes remaining. Otherwise prefaulting should have
> + * failed due to ENOENT (due to RET_PF_EMULATE for emulated MMIO when
> + * no memslot exists).
> + */
> + if (!expected_left)
> + TEST_ASSERT_VM_VCPU_IOCTL(!ret, KVM_PRE_FAULT_MEMORY, ret, vcpu->vm);