linux-kernel - Re: [KVM timekeeping 10/35] Fix deep C-state TSC desynchronization

Message-ID: <4C904685.9090402@redhat.com>
Date:	Tue, 14 Sep 2010 18:07:33 -1000
From:	Zachary Amsden <zamsden@...hat.com>
To:	Jan Kiszka <jan.kiszka@...mens.com>
CC:	kvm@...r.kernel.org, Avi Kivity <avi@...hat.com>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Glauber Costa <glommer@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	John Stultz <johnstul@...ibm.com>, linux-kernel@...r.kernel.org
Subject: Re: [KVM timekeeping 10/35] Fix deep C-state TSC desynchronization

On 09/13/2010 11:10 PM, Jan Kiszka wrote:
> Am 20.08.2010 10:07, Zachary Amsden wrote:
>    
>> When CPUs with unstable TSCs enter deep C-state, TSC may stop
>> running.  This causes us to require resynchronization.  Since
>> we can't tell when this may potentially happen, we assume the
>> worst by forcing re-compensation for it at every point the VCPU
>> task is descheduled.
>>
>> Signed-off-by: Zachary Amsden<zamsden@...hat.com>
>> ---
>>   arch/x86/kvm/x86.c |    2 +-
>>   1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 7fc4a55..52b6c21 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -1866,7 +1866,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>   	}
>>
>>   	kvm_x86_ops->vcpu_load(vcpu, cpu);
>> -	if (unlikely(vcpu->cpu != cpu)) {
>> +	if (unlikely(vcpu->cpu != cpu) || check_tsc_unstable()) {
>>   		/* Make sure TSC doesn't go backwards */
>>   		s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
>>   				native_read_tsc() - vcpu->arch.last_host_tsc;
>>      
> For yet unknown reason, this commit breaks Linux guests here if they are
> started with only a single VCPU. They hang during boot, obviously no
> longer receiving interrupts.
>
> I'm using kvm-kmod against a 2.6.34 host kernel, so this may be a side
> effect of the wrapping, though I cannot imagine how.
>
> Anyone any ideas?
>    

Question: how did you determine that this is the commit which breaks 
things?  I'm assuming you bisected, in which case a transition from 
stable -> unstable would have happened only once.  This also means the 
PM suspend event which you observed happened only once, so if you 
bisected successfully, there is a bug which doesn't involve the PM 
transition or the stable -> unstable transition.

Your host TSC must have desynchronized during the PM transition, and 
this change compensates the TSC on an unstable host to effectively show 
run time, not real time.  Perhaps the lack of catchup code (to catch 
back up to real time) is triggering the bug.
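
To illustrate the "run time, not real time" point, here is a minimal 
user-space sketch of the compensation.  It is a model under stated 
assumptions, not the kernel code: struct model_vcpu and the model_* 
functions are hypothetical names, the host TSC is passed in as a plain 
number rather than read from hardware, and the rule "subtract the 
descheduled delta from the guest offset when the TSC is unstable" is 
inferred from the description above rather than copied from x86.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of per-VCPU TSC state; not a kernel structure. */
struct model_vcpu {
	int      cpu;            /* host CPU last run on, -1 if never loaded */
	uint64_t last_host_tsc;  /* host TSC at last deschedule, 0 if none */
	int64_t  tsc_offset;     /* guest TSC = host TSC + tsc_offset */
};

/* Guest-visible TSC for a given host TSC reading. */
static uint64_t model_guest_tsc(const struct model_vcpu *v, uint64_t host_tsc)
{
	assert(v != NULL);
	return host_tsc + (uint64_t)v->tsc_offset;
}

/* VCPU task is descheduled: remember where the host TSC was. */
static void model_vcpu_put(struct model_vcpu *v, uint64_t host_tsc)
{
	assert(v != NULL);
	v->last_host_tsc = host_tsc;
}

/*
 * VCPU task is scheduled in.  With an unstable TSC the check runs on
 * every load, not only on CPU migration, and the host delta accumulated
 * while descheduled is removed from the guest's view.
 */
static void model_vcpu_load(struct model_vcpu *v, int cpu,
			    uint64_t host_tsc, bool tsc_unstable)
{
	assert(v != NULL);
	if (v->cpu != cpu || tsc_unstable) {
		int64_t delta = v->last_host_tsc ?
			(int64_t)(host_tsc - v->last_host_tsc) : 0;

		if (tsc_unstable)
			v->tsc_offset -= delta;	/* hide descheduled time */
		v->cpu = cpu;
	}
}
```

In this model, a VCPU descheduled at host TSC 1500 and loaded again at 
host TSC 5000 resumes with a guest TSC of 1500, so the guest counts 
only the cycles during which it ran.  Without separate catchup logic, 
the guest TSC falls further behind real time on every deschedule.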

In any case, I'll proceed with the forcing of unstable TSC and HPET 
clocksource and see what happens.

Zach