Message-ID: <4ACB5E1A.8000407@redhat.com>
Date: Tue, 06 Oct 2009 17:11:22 +0200
From: Avi Kivity <avi@...hat.com>
To: Dan Magenheimer <dan.magenheimer@...cle.com>
CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@...rix.com>,
Xen-devel <xen-devel@...ts.xensource.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
kurt.hackel@...cle.com, Keir Fraser <keir.fraser@...citrix.com>,
Glauber de Oliveira Costa <gcosta@...hat.com>,
zach.brown@...cle.com, the arch/x86 maintainers <x86@...nel.org>,
chris.mason@...cle.com
Subject: Re: [PATCH 3/5] x86/pvclock: add vsyscall implementation
On 10/06/2009 04:19 PM, Dan Magenheimer wrote:
>> From: Jeremy Fitzhardinge [mailto:jeremy.fitzhardinge@...rix.com]
>> With this in place, I can do a gettimeofday in about 100ns on a 2.4GHz
>> Q6600. I'm sure this could be tuned a bit more, but it is
>> already much better than a syscall.
>>
> To evaluate the goodness of this, we really need a full
> set of measurements for:
>
> a) cost of rdtsc (and rdtscp if different)
> b) cost of vsyscall+pvclock
> c) cost of rdtsc emulated
> d) cost of a hypercall that returns "hypervisor system time"
>
> On an E6850 (3GHz, but let's use cycles), I measured:
>
> a == 72 cycles
> c == 1080 cycles
> d == 780 cycles
>
> It may be partly apples and oranges, but it looks
> like a good guess for b on my machine is
>
> b == 240 cycles
>
Two rdtscps should suffice (and I think they are much faster on modern
machines).
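
The 240-cycle guess for b matches Jeremy's figure: 100ns at 2.4GHz is
240 cycles. A minimal sketch of that conversion, so the numbers above can
be compared in one unit (the helper names are hypothetical and not part
of the patch):

```c
/* Hypothetical helpers for comparing the figures in this thread.
 * A clock of `ghz` GHz ticks `ghz` times per nanosecond. */
static double ns_to_cycles(double ns, double ghz)
{
    return ns * ghz;
}

static double cycles_to_ns(double cycles, double ghz)
{
    return cycles / ghz;
}
```

ns_to_cycles(100, 2.4) gives 240, and at 3GHz the measured 72, 1080 and
780 cycles correspond to 24ns, 360ns and 260ns.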
> Not bad, but is there any additional context switch
> cost to support it?
>
rdtscp requires an additional MSR read/write (TSC_AUX) on heavyweight host
context switches. That should be negligible compared to the savings.
>> From: Avi Kivity [mailto:avi@...hat.com]
>> Instead of using vgetcpu() and rdtsc() independently, you can
>> use rdtscp
>> to read both atomically. This removes the need for the
>> preempt notifier.
>>
> Xen does not currently expose rdtscp and so does not emulate
> (or context switch) TSC_AUX. Context switching TSC_AUX
> is certainly possible, but will likely be expensive.
> If the primary reason for vsyscall+pvclock is to maximize
> performance for gettimeofday/clock_gettime, this cost
> would need to be added to the mix.
>
It will cost ~100 cycles on heavyweight host context switch
(guest-to-guest).
>> preempt notifiers are per-thread, not global, and will upset
>> the cycle
>> counters. I'd drop them and use rdtscp instead (and give up if the
>> processor doesn't support it).
>>
> Even if rdtscp is used, in the Intel processor lineup
> only the very latest (Nehalem) supports rdtscp, so
> "give up" doesn't seem like a very good option, at least
> in the near future.
>
Why not? We still fall back to the guest kernel. By the time guest
kernels with rdtscp support are in the field, these machines will be
quite old.
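
A rough userspace sketch of the "use rdtscp, else give up" idea, assuming
GCC or clang on x86. The names cpu_has_rdtscp(), read_tsc_and_aux() and
struct tsc_sample are made up for illustration; this is not the vsyscall
code from the patch, and what TSC_AUX contains is up to the kernel or
hypervisor that sets it:

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#include <x86intrin.h>
#define SKETCH_X86 1
#else
#define SKETCH_X86 0
#endif

/* Hypothetical container for one atomic reading. */
struct tsc_sample {
    uint64_t tsc;  /* time stamp counter */
    uint32_t aux;  /* TSC_AUX, typically used to carry a cpu number */
};

/* CPUID leaf 0x80000001, EDX bit 27 advertises RDTSCP. */
static int cpu_has_rdtscp(void)
{
#if SKETCH_X86
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return 0;
    return (edx >> 27) & 1;
#else
    return 0;
#endif
}

/*
 * Read the TSC and TSC_AUX with a single instruction, so both values
 * come from the same cpu. Returns 1 on success, or 0 when rdtscp is
 * unavailable and the caller should fall back to the ordinary syscall.
 */
static int read_tsc_and_aux(struct tsc_sample *s)
{
#if SKETCH_X86
    if (cpu_has_rdtscp()) {
        unsigned int aux;

        s->tsc = __rdtscp(&aux);
        s->aux = aux;
        return 1;
    }
#endif
    return 0;
}
```

A caller would try read_tsc_and_aux() first and take the normal
gettimeofday/clock_gettime path whenever it returns 0.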
--
error compiling committee.c: too many arguments to function