[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1348502600.11847.90.camel@twins>
Date: Mon, 24 Sep 2012 18:03:20 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Avi Kivity <avi@...hat.com>
Cc: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>,
"H. Peter Anvin" <hpa@...or.com>,
Marcelo Tosatti <mtosatti@...hat.com>,
Ingo Molnar <mingo@...hat.com>, Rik van Riel <riel@...hat.com>,
Srikar <srikar@...ux.vnet.ibm.com>,
"Nikunj A. Dadhania" <nikunj@...ux.vnet.ibm.com>,
KVM <kvm@...r.kernel.org>, Jiannan Ouyang <ouyang@...pitt.edu>,
chegu vinod <chegu_vinod@...com>,
"Andrew M. Theurer" <habanero@...ux.vnet.ibm.com>,
LKML <linux-kernel@...r.kernel.org>,
Srivatsa Vaddagiri <srivatsa.vaddagiri@...il.com>,
Gleb Natapov <gleb@...hat.com>,
Andrew Jones <drjones@...hat.com>
Subject: Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios
in PLE handler
On Mon, 2012-09-24 at 17:51 +0200, Avi Kivity wrote:
> On 09/24/2012 03:54 PM, Peter Zijlstra wrote:
> > On Mon, 2012-09-24 at 18:59 +0530, Raghavendra K T wrote:
> >> However Rik had a genuine concern in the cases where runqueue is not
> >> equally distributed and lockholder might actually be on a different run
> >> queue but not running.
> >
> > Load should eventually get distributed equally -- that's what the
> > load-balancer is for -- so this is a temporary situation.
>
> What's the expected latency? This is the whole problem. Eventually the
> scheduler would pick the lock holder as well, the problem is that it's
> in the millisecond scale while lock hold times are in the microsecond
> scale, leading to a 1000x slowdown.
Yeah I know.. Heisenberg's uncertainty applied to SMP computing becomes
something like accurate or fast, never both.
> If we want to yield, we really want to boost someone.
Now if only you knew which someone ;-) This non-modified guest nonsense
is such a snake pit.. but you know how I feel about all that.
> > We already try and favour the non running vcpu in this case, that's what
> > yield_to_task_fair() is about. If its still not eligible to run, tough
> > luck.
>
> Crazy idea: instead of yielding, just run that other vcpu in the thread
> that would otherwise spin. I can see about a million objections to this
> already though.
Yah.. you want me to list a few? :-) It would require synchronization
with the other cpu to pull its task -- one really wants to avoid it also
running it.
Do this at a high enough frequency and you're dead too.
Anyway, you can do this inside the KVM stuff, simply flip the vcpu state
associated with a vcpu thread and use the preemption notifiers to sort
things against the scheduler or somesuch.
> >> Do you think instead of using rq->nr_running, we could get a global
> >> sense of load using avenrun (something like avenrun/num_onlinecpus)
> >
> > To what purpose? Also, global stuff is expensive, so you should try and
> > stay away from it as hard as you possibly can.
>
> Spinning is also expensive. How about we do the global stuff every N
> times, to amortize the cost (and reduce contention)?
Nah, spinning isn't expensive, its a waste of time, similar end result
for someone who wants to do useful work though, but not the same cause.
Pick N and I'll come up with a scenario for which its wrong ;-)
Anyway, its an ugly problem and one I really want to contain inside the
insanity that created it (virt), lets not taint the rest of the kernel
more than we need to.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists