Message-ID: <1296575618.5081.13.camel@mothafucka.localdomain>
Date: Tue, 01 Feb 2011 13:53:38 -0200
From: Glauber Costa <glommer@...hat.com>
To: Avi Kivity <avi@...hat.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
aliguori@...ibm.com, Rik van Riel <riel@...hat.com>,
Jeremy Fitzhardinge <jeremy.fitzhardinge@...rix.com>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH v2 4/6] KVM-GST: KVM Steal time registration
On Sun, 2011-01-30 at 15:16 +0200, Avi Kivity wrote:
> On 01/28/2011 09:52 PM, Glauber Costa wrote:
> > Register steal time within KVM. Every time we sample the steal time
> > information, we update a local variable that tells what was the
> > last time read. We then account the difference.
> >
> >
> >
> > static void kvm_guest_cpu_offline(void *dummy)
> > {
> > kvm_pv_disable_apf(NULL);
> > + native_write_msr(MSR_KVM_STEAL_TIME, 0, 0);
> > apf_task_wake_all();
> > }
>
> Don't use the native_ versions, they override the pvops implementation.
> It doesn't matter for kvm, but we're not supposed to know this.
Fair.
> > + /*
> > + * using nanoseconds introduces noise, which accumulates easily
> > + * leading to big steal time values. We want, however, to keep the
> > + * interface nanosecond-based for future-proofness. The hypervisor may
> > + * adopt a similar strategy, but we can't rely on that.
> > + */
> > + delta /= NSEC_PER_MSEC;
> > + delta *= NSEC_PER_MSEC;
>
> You're working around this problem both in the guest and host. So even
> if we fix it in one, it will still be broken in the other.
And, as you may notice, in two different ways:
I am (was) rounding to usecs in the host, and to msecs in the guest.
One of the problems here is that if we account steal time, we refrain
from accounting user / system time. The reason is that if we account
both, we'll end up with more than HZ ticks per second, since the same
ticks get accounted as both steal and real.
And since the granularity of the cpu accounting is too coarse, we end up
with much more steal time than we should, because anything shorter than
1 unit of cputime is often rounded up to 1 unit of cputime.
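
As a rough illustration of the rounding effect described above (this is
not code from the patch), the sketch below compares charging each steal
sample as at least one cputime unit against accumulating nanoseconds and
converting once. It assumes one cputime unit is one tick at HZ=100
(10 ms); NSEC_PER_TICK and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: one cputime unit == one tick at HZ=100. */
#define NSEC_PER_TICK 10000000ULL

/*
 * Per-sample accounting: every nonzero steal delta is rounded up to a
 * whole tick before it is added, so sub-tick deltas are inflated.
 */
static uint64_t steal_ticks_rounded_per_sample(const uint64_t *delta_ns, int n)
{
	uint64_t ticks = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		if (delta_ns[i] == 0)
			continue;
		ticks += (delta_ns[i] + NSEC_PER_TICK - 1) / NSEC_PER_TICK;
	}
	return ticks;
}

/*
 * Accumulated accounting: sum the deltas in nanoseconds and convert to
 * ticks only once, so the error is bounded by a single tick in total.
 */
static uint64_t steal_ticks_accumulated(const uint64_t *delta_ns, int n)
{
	uint64_t total_ns = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++)
		total_ns += delta_ns[i];
	return total_ns / NSEC_PER_TICK;
}
```

With 100 samples of 0.3 ms each (30 ms of real steal time), the first
function reports 100 ticks and the second reports 3.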
Now, I've already said that I will investigate further, and I'm ready to
back off from all of this. But assuming my analysis is right so far, what
if we keep things in nsecs or msecs, and only convert to cputime at
read time? This would allow us to just subtract steal time from
user/system time, in a more fine-grained way.
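
One possible shape for this idea, as a minimal sketch under stated
assumptions rather than a proposed implementation: keep per-cpu counters
in nanoseconds, subtract the steal delta from the tick being charged to
user/system, and convert to cputime units only when the values are read.
The struct, the function names, and the HZ=100 tick length are all
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: one cputime unit == one tick at HZ=100. */
#define NSEC_PER_TICK 10000000ULL

/* Hypothetical per-cpu counters, kept in nanoseconds. */
struct cpu_acct_ns {
	uint64_t user_ns;
	uint64_t system_ns;
	uint64_t steal_ns;
};

/*
 * Called once per tick. The steal time observed since the previous tick
 * is subtracted from the tick before charging the remainder to user or
 * system, so the three counters never add up to more than elapsed time.
 * Simplification: steal beyond one tick is clamped, not carried over.
 */
static void account_tick_ns(struct cpu_acct_ns *a, int user,
			    uint64_t steal_delta_ns)
{
	assert(a != 0);
	if (steal_delta_ns > NSEC_PER_TICK)
		steal_delta_ns = NSEC_PER_TICK;

	a->steal_ns += steal_delta_ns;
	if (user)
		a->user_ns += NSEC_PER_TICK - steal_delta_ns;
	else
		a->system_ns += NSEC_PER_TICK - steal_delta_ns;
}

/* Conversion to cputime units happens only at read time. */
static uint64_t ns_to_ticks(uint64_t ns)
{
	return ns / NSEC_PER_TICK;
}
```

For 100 user ticks with 0.3 ms stolen from each, this yields 3 ticks of
steal and 97 ticks of user time at read, which still totals 100.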