linux-kernel - Re: [PATCH 16/17] TSC reset compensation

Message-ID: <4C181B93.2020009@redhat.com>
Date:	Tue, 15 Jun 2010 14:32:19 -1000
From:	Zachary Amsden <zamsden@...hat.com>
To:	Marcelo Tosatti <mtosatti@...hat.com>
CC:	avi@...hat.com, glommer@...hat.com, kvm@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 16/17] TSC reset compensation

On 06/15/2010 02:27 PM, Marcelo Tosatti wrote:
> On Mon, Jun 14, 2010 at 09:34:18PM -1000, Zachary Amsden wrote:
>    
>> Attempt to synchronize TSCs which are reset to the same value.  In the
>> case of a reliable hardware TSC, we can just re-use the same offset, but
>> on non-reliable hardware, we can get closer by adjusting the offset to
>> match the elapsed time.
>>
>> Signed-off-by: Zachary Amsden <zamsden@...hat.com>
>> ---
>>   arch/x86/kvm/x86.c |   34 ++++++++++++++++++++++++++++++++--
>>   1 files changed, 32 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 8e836e9..cedb71f 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -937,14 +937,44 @@ static inline void kvm_request_guest_time_update(struct kvm_vcpu *v)
>>   	set_bit(KVM_REQ_CLOCK_SYNC, &v->requests);
>>   }
>>
>> +static inline int kvm_tsc_reliable(void)
>> +{
>> +	return (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
>> +		boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
>> +		!check_tsc_unstable());
>> +}
>> +
>>   void guest_write_tsc(struct kvm_vcpu *vcpu, u64 data)
>>   {
>>   	struct kvm *kvm = vcpu->kvm;
>> -	u64 offset;
>> +	u64 offset, ns, elapsed;
>>
>>   	spin_lock(&kvm->arch.tsc_write_lock);
>>   	offset = data - native_read_tsc();
>> -	kvm->arch.last_tsc_nsec = get_kernel_ns();
>> +	ns = get_kernel_ns();
>> +	elapsed = ns - kvm->arch.last_tsc_nsec;
>> +
>> +	/*
>> +	 * Special case: identical write to TSC within 5 seconds of
>> +	 * another CPU is interpreted as an attempt to synchronize
>> +	 * (the 5 seconds is to accommodate host load / swapping).
>> +	 *
>> +	 * In that case, for a reliable TSC, we can match TSC offsets,
>> +	 * or make a best guess using kernel_ns value.
>> +	 */
>> +	if (data == kvm->arch.last_tsc_write && elapsed < 5 * NSEC_PER_SEC) {
>> +		if (kvm_tsc_reliable()) {
>> +			offset = kvm->arch.last_tsc_offset;
>> +			pr_debug("kvm: matched tsc offset for %llu\n", data);
>> +		} else {
>> +			u64 tsc_delta = elapsed * __get_cpu_var(cpu_tsc_khz);
>> +			tsc_delta = tsc_delta / USEC_PER_SEC;
>> +			offset -= tsc_delta;
>> +			pr_debug("kvm: adjusted tsc offset by %llu\n", tsc_delta);
>> +		}
>> +		ns = kvm->arch.last_tsc_nsec;
>> +	}
>> +	kvm->arch.last_tsc_nsec = ns;
>>   	kvm->arch.last_tsc_write = data;
>>   	kvm->arch.last_tsc_offset = offset;
>>   	kvm_x86_ops->write_tsc_offset(vcpu, offset);
>> -- 
>>      
> Could extend this to handle migration.
>    

Also, this could be extended to cover the kvmclock variables themselves;
then, if the TSC is reliable, we would never need to recalibrate the
kvmclock.  In fact, all VMs would then share the same kvmclock parameters,
differing only in kvm->arch.kvmclock_offset.

Zach
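
The offset arithmetic in the quoted patch can be tried out in user space.
What follows is a minimal sketch, not the kernel code: struct tsc_state,
nsec_to_cycles() and guest_write_tsc_sketch() are hypothetical names, and
the host TSC, the clock and the "reliable" flag are passed in as plain
parameters instead of being read from hardware. One assumption differs
from the quoted hunk and is worth checking before relying on it: taking
guest TSC = host TSC + offset, the sketch adds the elapsed cycles to the
freshly computed offset, whereas the hunk subtracts them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define USEC_PER_SEC 1000000ULL

/* Hypothetical stand-in for the fields the patch keeps in kvm->arch. */
struct tsc_state {
	uint64_t last_tsc_nsec;   /* host time of the last guest TSC write */
	uint64_t last_tsc_write;  /* value the guest wrote */
	uint64_t last_tsc_offset; /* offset programmed for that write */
};

/*
 * ns * kHz / 10^6 = cycles. Only called with ns below 5 seconds,
 * so the 64-bit product cannot overflow for realistic frequencies.
 */
static uint64_t nsec_to_cycles(uint64_t ns, uint64_t tsc_khz)
{
	return ns * tsc_khz / USEC_PER_SEC;
}

/* Returns the offset that would be handed to write_tsc_offset(). */
static uint64_t guest_write_tsc_sketch(struct tsc_state *s, uint64_t data,
				       uint64_t host_tsc, uint64_t now_ns,
				       uint64_t tsc_khz, int reliable)
{
	uint64_t offset, elapsed;

	assert(s != NULL);
	offset = data - host_tsc;            /* naive offset, wraps mod 2^64 */
	elapsed = now_ns - s->last_tsc_nsec;

	/* Same value written again within 5 seconds: treat as a sync attempt. */
	if (data == s->last_tsc_write && elapsed < 5 * NSEC_PER_SEC) {
		if (reliable)
			offset = s->last_tsc_offset;  /* reuse the earlier offset */
		else
			/* Assumption: add, so both vCPUs end up on one timeline. */
			offset += nsec_to_cycles(elapsed, tsc_khz);
		now_ns = s->last_tsc_nsec;    /* keep the original timestamp */
	}

	s->last_tsc_nsec = now_ns;
	s->last_tsc_write = data;
	s->last_tsc_offset = offset;
	return offset;
}
```

With a 2 GHz TSC (tsc_khz = 2000000), two writes of the same value made
1 ms apart see the host TSC advance by 2000000 cycles, and under the
sign assumption above both paths give the second vCPU the same offset
as the first.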