Message-ID: <aNQltowMx51v42Bw@google.com>
Date: Wed, 24 Sep 2025 10:09:10 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Yan Zhao <yan.y.zhao@...el.com>
Cc: pbonzini@...hat.com, reinette.chatre@...el.com, rick.p.edgecombe@...el.com,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH v2 3/3] KVM: selftests: Test prefault memory during
 concurrent memslot removal

On Mon, Sep 15, 2025, Sean Christopherson wrote:
> On Mon, Sep 15, 2025, Yan Zhao wrote:
> > On Mon, Sep 08, 2025 at 04:47:23PM -0700, Sean Christopherson wrote:
> > > On Fri, Aug 22, 2025, Yan Zhao wrote:
> > > + if (!slot_recreated) {
> > > + WRITE_ONCE(data.recreate_slot, true);
> > > + pthread_join(slot_worker, NULL);
> > > + slot_recreated = true;
> > > + continue;
> > If delete_slot_worker() invokes vm_mem_region_delete() slowly enough due to
> > scheduling delays, the return value from __vcpu_ioctl() could be 0 with
> > range.size being 0 at this point.
> >
> > What about checking range.size before continuing?
> >
> > @@ -120,7 +126,8 @@ static void pre_fault_memory(struct kvm_vcpu *vcpu, u64 base_gpa, u64 offset,
> > WRITE_ONCE(data.recreate_slot, true);
> > pthread_join(slot_worker, NULL);
> > slot_recreated = true;
> > - continue;
> > + if (range.size)
> > + continue;
> > }
> >
> >
> > Otherwise, the next __vcpu_ioctl() would return -1 with errno == EINVAL, which
> > will break the assertion below.
>
> Drat, I missed that kvm_vcpu_pre_fault_memory() returns -EINVAL on a size of '0'
> (see the wrong comment snippet "Either prefaulting already succeeded, in which
> case retrying should also succeed, or retry is needed to get a stable result").
>
> I'll circle back to this tomorrow. IIRC, there was a reason I didn't want to
> check range.size in that path, but for the life of me I can't remember why :-/
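To make sure we're reading the race the same way, the problematic ordering is
(hypothetical interleaving, spelled out purely for reference, using the names
from Yan's report and the test below):

  vCPU thread (pre_fault_memory)        slot worker (delete_slot_worker)
  ------------------------------        --------------------------------
  KVM_PRE_FAULT_MEMORY
    prefaults the entire range,
    returns 0 with range.size == 0
                                        vm_mem_region_delete() finally runs
                                        (delayed by scheduling)
  WRITE_ONCE(data.recreate_slot, true)
  pthread_join(slot_worker, NULL)       recreates the slot and exits
  blindly retry with range.size == 0
    => ioctl fails with EINVAL and the
       TEST_ASSERT below blows up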
I'm 99% certain I was worried about false passes, but after working through the
possible scenarios, I don't see any way for bailing on !range.size to result in
missing a KVM bug.  So I'll post a formal patch with the below squashed in.
Thanks much!
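For posterity, the reason a size == 0 retry fails: kvm_vcpu_pre_fault_memory()
rejects the range before doing any work, roughly like so (paraphrasing the
sanity checks from memory, not a verbatim quote of kvm_main.c):

	if (!PAGE_ALIGNED(range->gpa) ||
	    !PAGE_ALIGNED(range->size) ||
	    range->gpa + range->size <= range->gpa)
		return -EINVAL;

i.e. with range.size == 0, "gpa + size <= gpa" is trivially true, and so KVM
returns -EINVAL without ever attempting to prefault anything.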
diff --git a/tools/testing/selftests/kvm/pre_fault_memory_test.c b/tools/testing/selftests/kvm/pre_fault_memory_test.c
index 2dbabf4b0b15..f04768c1d2e4 100644
--- a/tools/testing/selftests/kvm/pre_fault_memory_test.c
+++ b/tools/testing/selftests/kvm/pre_fault_memory_test.c
@@ -112,15 +112,24 @@ static void pre_fault_memory(struct kvm_vcpu *vcpu, u64 base_gpa, u64 offset,
* slot was deleted) and/or to prepare for the next testcase.
* Wait for the worker to exit so that the next invocation of
* prefaulting is guaranteed to complete (assuming no KVM bugs).
- * Always retry prefaulting to simply the retry logic. Either
- * prefaulting already succeeded, in which case retrying should
- * also succeed, or retry is needed to get a stable result.
*/
if (!slot_recreated) {
WRITE_ONCE(data.recreate_slot, true);
pthread_join(slot_worker, NULL);
slot_recreated = true;
- continue;
+
+ /*
+ * Retry prefaulting to get a stable result, i.e. to
+ * avoid seeing random EAGAIN failures. Don't retry if
+ * prefaulting already succeeded, as KVM disallows
+ * prefaulting with size=0, i.e. blindly retrying would
+ * result in test failures due to EINVAL. KVM should
+ * always return success if all bytes are prefaulted,
+ * i.e. there is no need to guard against EAGAIN being
+ * returned.
+ */
+ if (range.size)
+ continue;
}
/*