[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zj3ksShWaFSWstii@gmail.com>
Date: Fri, 10 May 2024 02:11:13 -0700
From: Breno Leitao <leitao@...ian.org>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, rbc@...a.com, paulmck@...nel.org,
"open list:KERNEL VIRTUAL MACHINE (KVM)" <kvm@...r.kernel.org>,
open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] KVM: Addressing a possible race in kvm_vcpu_on_spin:
Hello Sean,
On Thu, May 09, 2024 at 09:42:48AM -0700, Sean Christopherson wrote:
> On Thu, May 09, 2024, Breno Leitao wrote:
> > kvm_vcpu_set_in_spin_loop(me, true);
> > /*
> > * We boost the priority of a VCPU that is runnable but not
> > @@ -4109,7 +4110,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
> >
> > yielded = kvm_vcpu_yield_to(vcpu);
> > if (yielded > 0) {
> > - kvm->last_boosted_vcpu = i;
> > + WRITE_ONCE(kvm->last_boosted_vcpu, i);
> > break;
> > } else if (yielded < 0) {
> > try--;
>
> Side topic #1: am I the only one that finds these loops unnecessarily hard to
> read?
No. :-)
In fact, when I skimmed over the code, I though that maybe the code was
not covering the vCPUs before last_boosted_vcpu in the array.
Now that I am looking at it carefully, the code is using `pass` to track
if the vCPU passed last_boosted_vcpu in the index.
> Unless I'm misreading the code, it's really just an indirect way of looping
> over all vCPUs, starting at last_boosted_vcpu+1 and the wrapping.
>
> IMO, reworking it to be like this is more straightforward:
>
> int nr_vcpus, start, i, idx, yielded;
> struct kvm *kvm = me->kvm;
> struct kvm_vcpu *vcpu;
> int try = 3;
>
> nr_vcpus = atomic_read(&kvm->online_vcpus);
> if (nr_vcpus < 2)
> return;
>
> /* Pairs with the smp_wmb() in kvm_vm_ioctl_create_vcpu(). */
> smp_rmb();
Why do you need this now? Isn't the RCU read lock in xa_load() enough?
> kvm_vcpu_set_in_spin_loop(me, true);
>
> start = READ_ONCE(kvm->last_boosted_vcpu) + 1;
> for (i = 0; i < nr_vcpus; i++) {
Why do you need to started at the last boosted vcpu? I.e, why not
starting at 0 and skipping me->vcpu_idx and kvm->last_boosted_vcpu?
> idx = (start + i) % nr_vcpus;
> if (idx == me->vcpu_idx)
> continue;
>
> vcpu = xa_load(&kvm->vcpu_array, idx);
> if (!READ_ONCE(vcpu->ready))
> continue;
> if (kvm_vcpu_is_blocking(vcpu) && !vcpu_dy_runnable(vcpu))
> continue;
>
> /*
> * Treat the target vCPU as being in-kernel if it has a pending
> * interrupt, as the vCPU trying to yield may be spinning
> * waiting on IPI delivery, i.e. the target vCPU is in-kernel
> * for the purposes of directed yield.
> */
> if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
> !kvm_arch_dy_has_pending_interrupt(vcpu) &&
> !kvm_arch_vcpu_preempted_in_kernel(vcpu))
> continue;
>
> if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
> continue;
>
> yielded = kvm_vcpu_yield_to(vcpu);
> if (yielded > 0) {
> WRITE_ONCE(kvm->last_boosted_vcpu, i);
> break;
> } else if (yielded < 0 && !--try) {
> break;
> }
> }
>
> kvm_vcpu_set_in_spin_loop(me, false);
>
> /* Ensure vcpu is not eligible during next spinloop */
> kvm_vcpu_set_dy_eligible(me, false);
I didn't tested it, but I reviewed it, and it seems sane and way easier
to read. I agree this code is easier to read, from someone that has
little KVM background.
Powered by blists - more mailing lists