Message-ID: <4C904685.9090402@redhat.com>
Date: Tue, 14 Sep 2010 18:07:33 -1000
From: Zachary Amsden <zamsden@...hat.com>
To: Jan Kiszka <jan.kiszka@...mens.com>
CC: kvm@...r.kernel.org, Avi Kivity <avi@...hat.com>,
Marcelo Tosatti <mtosatti@...hat.com>,
Glauber Costa <glommer@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
John Stultz <johnstul@...ibm.com>, linux-kernel@...r.kernel.org
Subject: Re: [KVM timekeeping 10/35] Fix deep C-state TSC desynchronization
On 09/13/2010 11:10 PM, Jan Kiszka wrote:
> On 20.08.2010 10:07, Zachary Amsden wrote:
>
>> When CPUs with unstable TSCs enter deep C-state, TSC may stop
>> running. This causes us to require resynchronization. Since
>> we can't tell when this may potentially happen, we assume the
>> worst by forcing re-compensation for it at every point the VCPU
>> task is descheduled.
>>
>> Signed-off-by: Zachary Amsden<zamsden@...hat.com>
>> ---
>> arch/x86/kvm/x86.c | 2 +-
>> 1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 7fc4a55..52b6c21 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -1866,7 +1866,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>          }
>>
>>          kvm_x86_ops->vcpu_load(vcpu, cpu);
>> -        if (unlikely(vcpu->cpu != cpu)) {
>> +        if (unlikely(vcpu->cpu != cpu) || check_tsc_unstable()) {
>>                  /* Make sure TSC doesn't go backwards */
>>                  s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
>>                                  native_read_tsc() - vcpu->arch.last_host_tsc;
>>
> For a yet unknown reason, this commit breaks Linux guests here if they are
> started with only a single VCPU. They hang during boot, obviously no
> longer receiving interrupts.
>
> I'm using kvm-kmod against a 2.6.34 host kernel, so this may be a side
> effect of the wrapping, though I cannot imagine how.
>
> Does anyone have any ideas?
>
Question: how did you determine that this is the commit which breaks
things? I'm assuming you bisected, in which case a transition from
stable -> unstable would have happened only once. This also means the
PM suspend event which you observed happened only once, so obviously if
you bisected successfully, there is a bug which doesn't involve the PM
transition or the stable -> unstable transition.

Your host TSC must have desynchronized during the PM transition, and
this change compensates the TSC on an unstable host to effectively show
run time, not real time. Perhaps the lack of catchup code (to catch
back up to real time) is triggering the bug.
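
To illustrate what "run time, not real time" means here, the following is
a minimal userspace sketch, not the kernel code. All names (toy_vcpu,
toy_vcpu_put, toy_vcpu_load, toy_guest_tsc) are hypothetical, and the
model assumes the guest TSC is simply the host TSC plus a per-VCPU
offset, with the host TSC sampled when the VCPU is descheduled. It covers
only the unstable-TSC path: on every load, the host TSC delta accumulated
while the VCPU was off the CPU is subtracted from the offset, so the
guest TSC resumes where it stopped.

```c
#include <stdint.h>

/* Hypothetical toy model of one VCPU's TSC state (not the kernel struct). */
struct toy_vcpu {
    uint64_t last_host_tsc; /* host TSC at last deschedule; 0 = never set */
    int64_t  tsc_offset;    /* assumed model: guest TSC = host TSC + offset */
};

/* Guest-visible TSC for a given host TSC reading (wraps modulo 2^64). */
static uint64_t toy_guest_tsc(const struct toy_vcpu *v, uint64_t host_tsc)
{
    return host_tsc + (uint64_t)v->tsc_offset;
}

/* VCPU task is descheduled: remember where the host TSC was. */
static void toy_vcpu_put(struct toy_vcpu *v, uint64_t host_tsc)
{
    v->last_host_tsc = host_tsc;
}

/*
 * VCPU task is scheduled back in. With an unstable host TSC, compensate
 * unconditionally: hide whatever the host TSC did while the VCPU was not
 * running, including a backwards jump (negative delta).
 */
static void toy_vcpu_load(struct toy_vcpu *v, uint64_t host_tsc,
                          int tsc_unstable)
{
    int64_t delta;

    if (!tsc_unstable)
        return;

    /* No previous sample means there is nothing to compensate yet. */
    delta = v->last_host_tsc ? (int64_t)(host_tsc - v->last_host_tsc) : 0;
    v->tsc_offset -= delta;
}
```

In this model, a VCPU descheduled at host TSC 1500 and reloaded at host
TSC 4000 still reads 1500 in the guest, and a host TSC that restarts from
a small value after a deep C-state does not make the guest TSC go
backwards. The guest then lags real time by the descheduled interval,
which is the gap that catchup code would have to close.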

In any case, I'll proceed with forcing an unstable TSC and the HPET
clocksource and see what happens.

Zach