[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <54AE7D4F.2040300@cantab.net>
Date: Thu, 08 Jan 2015 12:51:27 +0000
From: David Vrabel <dvrabel@...tab.net>
To: Andy Lutomirski <luto@...capital.net>,
Paolo Bonzini <pbonzini@...hat.com>,
Marcelo Tosatti <mtosatti@...hat.com>
CC: Gleb Natapov <gleb@...nel.org>,
"xen-devel@...ts.xenproject.org" <xen-devel@...ts.xenproject.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
kvm list <kvm@...r.kernel.org>
Subject: Re: [Xen-devel] [RFC 2/2] x86, vdso, pvclock: Simplify and speed
up the vdso pvclock reader
On 23/12/2014 00:39, Andy Lutomirski wrote:
> The pvclock vdso code was too abstracted to understand easily and
> excessively paranoid. Simplify it for a huge speedup.
>
> This opens the door for additional simplifications, as the vdso no
> longer accesses the pvti for any vcpu other than vcpu 0.
>
> Before, vclock_gettime using kvm-clock took about 64ns on my machine.
> With this change, it takes 19ns, which is almost as fast as the pure TSC
> implementation.
Xen guests don't use any of this at the moment, and I don't think this
change would prevent us from using it in the future, so:
Acked-by: David Vrabel <david.vrabel@...rix.com>
But see some additional comments below.
> --- a/arch/x86/vdso/vclock_gettime.c
> +++ b/arch/x86/vdso/vclock_gettime.c
> @@ -78,47 +78,59 @@ static notrace const struct pvclock_vsyscall_time_info *get_pvti(int cpu)
>
> static notrace cycle_t vread_pvclock(int *mode)
> {
> - const struct pvclock_vsyscall_time_info *pvti;
> + const struct pvclock_vcpu_time_info *pvti = &get_pvti(0)->pvti;
Xen updates pvti when scheduling a VCPU. Using 0 here requires that
VCPU 0 has been recently scheduled by Xen. Perhaps using the current
CPU here would be better? It doesn't matter if the task is subsequently
moved to a different CPU before using pvti.
> + * Note: The kernel and hypervisor must guarantee that cpu ID
> + * number maps 1:1 to per-CPU pvclock time info.
> + *
> + * Because the hypervisor is entirely unaware of guest userspace
> + * preemption, it cannot guarantee that per-CPU pvclock time
> + * info is updated if the underlying CPU changes or that that
> + * version is increased whenever underlying CPU changes.
> + *
> + * On KVM, we are guaranteed that pvti updates for any vCPU are
> + * atomic as seen by *all* vCPUs. This is an even stronger
> + * guarantee than we get with a normal seqlock.
> *
> + * On Xen, we don't appear to have that guarantee, but Xen still
> + * supplies a valid seqlock using the version field.
> +
> + * We only do pvclock vdso timing at all if
> + * PVCLOCK_TSC_STABLE_BIT is set, and we interpret that bit to
> + * mean that all vCPUs have matching pvti and that the TSC is
> + * synced, so we can just look at vCPU 0's pvti.
I think this is a much stronger requirement than you actually need.
You only require:
- the system time (pvti->system_time) for all pvti's is synchronized; and
- TSC is synchronized; and
- the pvti has been updated sufficiently recently (so the error in the
result is within acceptable margins).
Can you add documentation to arch/x86/include/asm/pvclock-abi.h to
describe what properties PVCLOCK_TSC_STABLE_BIT guarantees?
David
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists