Message-ID: <4CFB8BFA.4040100@redhat.com>
Date: Sun, 05 Dec 2010 14:56:26 +0200
From: Avi Kivity <avi@...hat.com>
To: Rik van Riel <riel@...hat.com>
CC: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Ingo Molnar <mingo@...e.hu>,
Anthony Liguori <aliguori@...ux.vnet.ibm.com>
Subject: Re: [RFC PATCH 3/3] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin
On 12/02/2010 09:45 PM, Rik van Riel wrote:
> Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic
> slowdowns of certain workloads, we instead use yield_to to hand
> the rest of our timeslice to another vcpu in the same KVM guest.
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 80f17db..a6eeafc 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1880,18 +1880,53 @@ void kvm_resched(struct kvm_vcpu *vcpu)
> }
> EXPORT_SYMBOL_GPL(kvm_resched);
>
> -void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu)
> +void kvm_vcpu_on_spin(struct kvm_vcpu *me)
> {
> - ktime_t expires;
> - DEFINE_WAIT(wait);
> + struct kvm *kvm = me->kvm;
> + struct kvm_vcpu *vcpu;
> + int last_boosted_vcpu = me->kvm->last_boosted_vcpu;
> + int first_round = 1;
> + int i;
>
> - prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
> + me->spinning = 1;
> +
> + /*
> + * We boost the priority of a VCPU that is runnable but not
> + * currently running, because it got preempted by something
> + * else and called schedule in __vcpu_run. Hopefully that
> + * VCPU is holding the lock that we need and will release it.
> + * We approximate round-robin by starting at the last boosted VCPU.
> + */
> + again:
> + kvm_for_each_vcpu(i, vcpu, kvm) {
> + struct task_struct *task = vcpu->task;
> + if (first_round && i < last_boosted_vcpu) {
> + i = last_boosted_vcpu;
> + continue;
> + } else if (!first_round && i > last_boosted_vcpu)
> + break;
> + if (vcpu == me)
> + continue;
> + if (vcpu->spinning)
> + continue;

You may well want to wake up a spinner. Suppose:

  A takes a lock
  B preempts A
  B grabs a ticket, starts spinning, yields to A
  A releases the lock
  A grabs a ticket, starts spinning

At this point we want A to yield to B, but it won't, because the
vcpu->spinning check above skips B.

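
To make the sequence concrete, here is a minimal user-space sketch of it.
It is a model only, not kernel code: struct ticket_lock, struct model_vcpu,
replay_scenario() and pick_target() are hypothetical names invented for
illustration, and the "runnable" flag merely stands in for whatever the
scheduler would report. It shows that once A is spinning, the only vcpu
that can make progress is B, which a filter that skips spinners rejects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; these names are not KVM or kernel APIs. */
struct ticket_lock {
    unsigned next;   /* next ticket to hand out */
    unsigned owner;  /* ticket currently allowed to hold the lock */
};

struct model_vcpu {
    bool spinning;   /* stands in for the patch's vcpu->spinning */
    bool runnable;   /* preempted, but ready to run */
    unsigned ticket; /* ticket this vcpu is holding or waiting on */
};

enum { VCPU_A = 0, VCPU_B = 1 };

static unsigned take_ticket(struct ticket_lock *l) { return l->next++; }
static void release_lock(struct ticket_lock *l) { l->owner++; }

/* Replay the sequence from the review comment; A ends up as the caller. */
static void replay_scenario(struct ticket_lock *l, struct model_vcpu v[2])
{
    l->next = 0;
    l->owner = 0;
    v[VCPU_A] = (struct model_vcpu){ false, true, 0 };
    v[VCPU_B] = (struct model_vcpu){ false, true, 0 };

    v[VCPU_A].ticket = take_ticket(l); /* A takes the lock (ticket 0) */
    v[VCPU_B].ticket = take_ticket(l); /* B preempts A, grabs ticket 1 */
    v[VCPU_B].spinning = true;         /* B spins, then yields to A */
    release_lock(l);                   /* A releases: owner becomes 1 */
    v[VCPU_A].ticket = take_ticket(l); /* A grabs ticket 2 */
    v[VCPU_A].spinning = true;         /* A spins; B is next in line */
}

/* Pick a vcpu to yield to; returns its index, or -1 if none qualifies. */
static int pick_target(const struct model_vcpu *v, size_t n, size_t me,
                       bool skip_spinners)
{
    assert(me < n);
    for (size_t i = 0; i < n; i++) {
        if (i == me || !v[i].runnable)
            continue;
        if (skip_spinners && v[i].spinning)
            continue;
        return (int)i;
    }
    return -1;
}
```

After replay_scenario(), the lock's owner ticket equals B's ticket, yet
pick_target() with skip_spinners set finds nobody for A to yield to; with
the flag cleared it returns B.
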
> + if (!task)
> + continue;
> + if (waitqueue_active(&vcpu->wq))
> + continue;
> + if (task->flags & PF_VCPU)
> + continue;
> + kvm->last_boosted_vcpu = i;
> + yield_to(task);
> + break;
> + }

I think a random selection algorithm would hold up better against
special guest behaviour than this approximate round-robin.

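
A rough sketch of what random selection could look like, in user-space C
and under stated assumptions: pick_random_target() is a hypothetical
helper, the eligible[] array stands in for the patch's per-vcpu checks,
and the random number is passed in by the caller rather than drawn from
any particular kernel PRNG. Note that this is "random starting point, then
a linear scan", which is simpler than a uniform draw among eligible vcpus
and not equivalent to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a KVM API: choose a yield target by starting
 * the scan at a random slot. 'rnd' is a caller-supplied random number.
 */
static int pick_random_target(const bool *eligible, size_t n, size_t me,
                              unsigned int rnd)
{
    assert(n > 0 && me < n);

    size_t start = rnd % n;

    /* Visit each slot exactly once, wrapping around at the end. */
    for (size_t step = 0; step < n; step++) {
        size_t i = (start + step) % n;

        if (i == me || !eligible[i])
            continue;
        return (int)i;
    }
    return -1; /* nobody to yield to */
}
```

Because the starting slot changes on every call, a guest that always
leaves the same vcpu in a particular position does not get the same
target picked each time.
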
>
> - /* Sleep for 100 us, and hope lock-holder got scheduled */
> - expires = ktime_add_ns(ktime_get(), 100000UL);
> - schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
> + if (first_round && last_boosted_vcpu == kvm->last_boosted_vcpu) {
> + /* We have not found anyone yet. */
> + first_round = 0;
> + goto again;
Need to guarantee termination.
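
One way to make termination evident by construction is to drop the goto
and scan a fixed number of slots. The following is only a user-space
sketch under assumptions: pick_round_robin() is a hypothetical helper, and
eligible[] again stands in for the patch's per-vcpu checks. The loop runs
at most n times regardless of what other vcpus do to last_boosted in the
meantime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not the patch's code: round-robin scan that starts
 * just after the last boosted slot and inspects each slot at most once.
 */
static int pick_round_robin(const bool *eligible, size_t n, size_t me,
                            size_t last_boosted)
{
    assert(n > 0 && me < n);

    /* Exactly n iterations: the bound does not depend on shared state. */
    for (size_t step = 1; step <= n; step++) {
        size_t i = (last_boosted + step) % n;

        if (i == me || !eligible[i])
            continue;
        return (int)i; /* caller would record i and yield to it */
    }
    return -1; /* full pass completed without finding a target */
}
```

The previously boosted slot is visited last, so it is only chosen again
when nothing else qualifies.
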
--
error compiling committee.c: too many arguments to function