linux-kernel - Re: kvmclock

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <4731C4D1.7040101@qumranet.com>
Date:	Wed, 07 Nov 2007 15:59:45 +0200
From:	Avi Kivity <avi@...ranet.com>
To:	Glauber de Oliveira Costa <gcosta@...hat.com>
CC:	linux-kernel@...r.kernel.org, jeremy@...p.org, aliguori@...ibm.com,
	kvm-devel@...ts.sourceforge.net, hollisb@...ibm.com
Subject: Re: kvmclock - the host part.

Glauber de Oliveira Costa wrote:
>>>  void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu)
>>>  {
>>> +	vcpu->time_needs_update = 1;
>>>   
>>>       
>> Why here and not in __vcpu_run()?  It isn't timer irq related.
>>     
> Because my plan was exactly, updating it at each timer interrupt.
>   

I think kvm_inject_pending_timer_irqs() is called every __vcpu_run(), so 
your cunning plan has been foiled.

Did you mean each guest interrupt of host interrupt?

> There's a trade off between
> updating every run (hopefully more precision, but more overhead), versus
> updating at timer irqs, or other events.
>
> What would you prefer?
>   

I think that we should update it every time a heavyweight exit has been 
taken.  That takes care of the tradeoff quite nicely -- heavyweight 
exits are already dog slow.

>   
>
>>> +	/* Updating the tsc count is the first thing we do */
>>> +	kvm_get_msr(vcpu, MSR_IA32_TIME_STAMP_COUNTER, &vcpu->hv_clock.last_tsc);
>>> +	ktime_get_ts(&ts);
>>> +	vcpu->hv_clock.now_ns = ts.tv_nsec + (NSEC_PER_SEC * (u64)ts.tv_sec);
>>> +	vcpu->hv_clock.wc_sec = get_seconds();
>>> +	vcpu->hv_clock.version++;
>>> +
>>> +	clock_addr = vcpu->clock_addr;
>>> +	memcpy(clock_addr, &vcpu->hv_clock, sizeof(vcpu->hv_clock));
>>> +	mark_page_dirty(vcpu->kvm, vcpu->clock_gfn);
>>>   
>>>       
>> Just use kvm_write_guest().
>>     
> Too slow. Updating guest time, even only in timer interrupts, was a too
> frequent operation, and the kmap / kunmap (atomic) at every iteration
> deemed the whole thing
> unusable.
>   

kvm_write_guest() will eventually be a copy_to_user(), so you need not 
fear the overhead.


>>>  
>>> +	ret = 0;
>>>  	switch (nr) {
>>> +	case  KVM_HCALL_REGISTER_CLOCK: {
>>> +		struct kvm_vcpu *dst_vcpu;
>>> +
>>> +		if (!((a1 < KVM_MAX_VCPUS) && (vcpu->kvm->vcpus[a1]))) {
>>> +			ret = -KVM_EINVAL;
>>> +			break;
>>> +		}
>>> +
>>> +		dst_vcpu = vcpu->kvm->vcpus[a1];
>>>   
>>>       
>> What if !dst_vcpu?  What about locking?
>>
>> Suggest simply using vcpu.  Every guest cpu can register its own
>>     
> Earlier version had a check for !dst_vcpu, you are absolutely right.
>
> Locking was not a problem in practice, because these operations are done
>  serialized, by the same cpu.
>   

Think evil guest that cares not for the well-being of the host.

> This hypercall is called by cpu_up, which, at least in the beginning,
> it's called by cpu0. And that's why each vcpu cannot register its own.
> (And why we don't need locking).
>
> Well, theorectically each vcpu do can register its own clocksource, it
> will just be a little bit more complicated, we have to fire out an IPI,
> and have the other cpu to catch it, and call the hypercall.
>
>   

Can it not be done via the processor startup sequence?  Then there's no 
need for ipis and locking.

I imagine a normal guest initializes the apic in the same way.

> But I honestly don't like it.
> Usually, the cpu leaves start_secondary with a clock already registered,
> so the kernel relies on it.
>
>   
>>> +		dst_vcpu->clock_page = gfn_to_page(vcpu->kvm, a0 >> PAGE_SHIFT);
>>>   
>>>       
>> Shift right?  Why?
>>     
> a0 is not a gfn, but a physical address.
>   

What if the guest wants to place it in address 5GB?  That's unlikely for 
Linux and Windows, but let's do it right anyway.

-- 
error compiling committee.c: too many arguments to function

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/