linux-kernel - Re: [KVM timekeeping 10/35] Fix deep C-state TSC desynchronization


Message-ID: <4C907F3D.6070709@web.de>
Date:	Wed, 15 Sep 2010 10:09:33 +0200
From:	Jan Kiszka <jan.kiszka@....de>
To:	Zachary Amsden <zamsden@...hat.com>
CC:	kvm@...r.kernel.org, Avi Kivity <avi@...hat.com>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Glauber Costa <glommer@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	John Stultz <johnstul@...ibm.com>, linux-kernel@...r.kernel.org
Subject: Re: [KVM timekeeping 10/35] Fix deep C-state TSC desynchronization

On 15.09.2010 06:07, Zachary Amsden wrote:
> On 09/13/2010 11:10 PM, Jan Kiszka wrote:
>> On 20.08.2010 10:07, Zachary Amsden wrote:
>>   
>>> When CPUs with unstable TSCs enter deep C-state, TSC may stop
>>> running.  This causes us to require resynchronization.  Since
>>> we can't tell when this may potentially happen, we assume the
>>> worst by forcing re-compensation for it at every point the VCPU
>>> task is descheduled.
>>>
>>> Signed-off-by: Zachary Amsden <zamsden@...hat.com>
>>> ---
>>>   arch/x86/kvm/x86.c |    2 +-
>>>   1 files changed, 1 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 7fc4a55..52b6c21 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -1866,7 +1866,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>>       }
>>>
>>>       kvm_x86_ops->vcpu_load(vcpu, cpu);
>>> -    if (unlikely(vcpu->cpu != cpu)) {
>>> +    if (unlikely(vcpu->cpu != cpu) || check_tsc_unstable()) {
>>>           /* Make sure TSC doesn't go backwards */
>>>           s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
>>>                   native_read_tsc() - vcpu->arch.last_host_tsc;
>>>      
>> For some as yet unknown reason, this commit breaks Linux guests here if
>> they are started with only a single VCPU. They hang during boot,
>> obviously no longer receiving interrupts.
>>
>> I'm using kvm-kmod against a 2.6.34 host kernel, so this may be a side
>> effect of the wrapping, though I cannot imagine how.
>>
>> Does anyone have any ideas?
>>    
> 
> Question: how did you come to the knowledge that this is the commit
> which breaks things?  I'm assuming you bisected, in which case a
> transition from stable -> unstable would have only happened once.

Right.

>  This
> also means the PM suspend event which you observed only happened once,
> so obviously if you bisected successfully, there is a bug which doesn't
> involve the PM transition or the stable -> unstable transition.

Right, see my other posting.

> 
> Your host TSC must have desynchronized during the PM transition, and
> this change compensates the TSC on an unstable host to effectively show
> run time, not real time.  Perhaps the lack of catchup code (to catch
> back up to real time) is triggering the bug.

I'm still unsure if KVM is right in declaring the TSC unstable. It looks
like Linux is less picky here - are the requirements different?

> 
> In any case, I'll proceed with the forcing of unstable TSC and HPET
> clocksource and see what happens.

I tried that before, but it did not trigger the issue that kvm-clock
guests no longer boot properly. This only happens if the TSC is marked
unstable.

Jan
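
Editorial sketch, not part of the original message: the quoted hunk only
shows the condition and the tsc_delta computation, so the model below is
a user-space illustration under stated assumptions, not the code in
arch/x86/kvm/x86.c. The names vcpu_model, guest_tsc, model_vcpu_put and
model_vcpu_load are hypothetical, and the step that subtracts the delta
from the offset is an assumption based on Zachary's description that the
change makes the guest see "run time, not real time" on an unstable
host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-VCPU state the patch touches. */
struct vcpu_model {
    int      cpu;            /* host CPU the VCPU last ran on */
    uint64_t last_host_tsc;  /* host TSC sampled at deschedule, 0 if unset */
    int64_t  tsc_offset;     /* guest TSC = host TSC + tsc_offset */
};

/* Guest-visible TSC for a given host TSC reading. */
static uint64_t guest_tsc(const struct vcpu_model *v, uint64_t host_tsc)
{
    return host_tsc + (uint64_t)v->tsc_offset;
}

/* Remember the host TSC at the point the VCPU task is descheduled. */
static void model_vcpu_put(struct vcpu_model *v, uint64_t host_tsc)
{
    assert(v);
    v->last_host_tsc = host_tsc;
}

/*
 * Models the patched condition: recompute the delta when the VCPU
 * migrates to another CPU OR whenever the TSC is flagged unstable.
 * Returns the delta that was computed (0 if the branch was skipped).
 */
static int64_t model_vcpu_load(struct vcpu_model *v, int cpu,
                               uint64_t host_tsc, bool tsc_unstable)
{
    int64_t delta = 0;

    assert(v);
    if (v->cpu != cpu || tsc_unstable) {
        /* Same shape as the quoted hunk: no delta without a prior sample. */
        delta = v->last_host_tsc ?
                (int64_t)(host_tsc - v->last_host_tsc) : 0;

        /* Assumption: on an unstable TSC, hide the descheduled time. */
        if (tsc_unstable)
            v->tsc_offset -= delta;
    }
    v->cpu = cpu;
    return delta;
}
```

With an unstable TSC, a VCPU descheduled at host TSC 1000 and loaded
again at 1500 on the same CPU gets a delta of 500 folded into its
offset, so the guest still reads 1000 at that instant. With a stable TSC
and no migration the branch is skipped and the guest reads 1500.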

