Message-ID: <20100825233805.GA2985@mothafucka.localdomain>
Date:	Wed, 25 Aug 2010 20:38:05 -0300
From:	Glauber Costa <glommer@...hat.com>
To:	Marcelo Tosatti <mtosatti@...hat.com>
Cc:	Zachary Amsden <zamsden@...hat.com>, kvm@...r.kernel.org,
	Avi Kivity <avi@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	John Stultz <johnstul@...ibm.com>, linux-kernel@...r.kernel.org
Subject: Re: [KVM timekeeping 25/35] Add clock catchup mode

On Wed, Aug 25, 2010 at 07:01:34PM -0300, Marcelo Tosatti wrote:
> On Wed, Aug 25, 2010 at 10:48:20AM -1000, Zachary Amsden wrote:
> > On 08/25/2010 07:27 AM, Marcelo Tosatti wrote:
> > >On Thu, Aug 19, 2010 at 10:07:39PM -1000, Zachary Amsden wrote:
> > >>Make the clock update handler handle generic clock synchronization,
> > >>not just KVM clock.  We add a catchup mode which keeps passthrough
> > >>TSC in line with absolute guest TSC.
> > >>
> > >>Signed-off-by: Zachary Amsden <zamsden@...hat.com>
> > >>---
> > >>  arch/x86/include/asm/kvm_host.h |    1 +
> > >>  arch/x86/kvm/x86.c              |   55 ++++++++++++++++++++++++++------------
> > >>  2 files changed, 38 insertions(+), 18 deletions(-)
> > >>
> > >>  	kvm_x86_ops->vcpu_load(vcpu, cpu);
> > >>-	if (unlikely(vcpu->cpu != cpu) || check_tsc_unstable()) {
> > >>+	if (unlikely(vcpu->cpu != cpu) || vcpu->arch.tsc_rebase) {
> > >>  		/* Make sure TSC doesn't go backwards */
> > >>  		s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
> > >>  				native_read_tsc() - vcpu->arch.last_host_tsc;
> > >>  		if (tsc_delta < 0)
> > >>  			mark_tsc_unstable("KVM discovered backwards TSC");
> > >>-		if (check_tsc_unstable())
> > >>+		if (check_tsc_unstable()) {
> > >>  			kvm_x86_ops->adjust_tsc_offset(vcpu, -tsc_delta);
> > >>-		kvm_migrate_timers(vcpu);
> > >>+			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
> > >>+		}
> > >>+		if (vcpu->cpu != cpu)
> > >>+			kvm_migrate_timers(vcpu);
> > >>  		vcpu->cpu = cpu;
> > >>+		vcpu->arch.tsc_rebase = 0;
> > >>  	}
> > >>  }
> > >>
> > >>@@ -1947,6 +1961,12 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > >>  	kvm_x86_ops->vcpu_put(vcpu);
> > >>  	kvm_put_guest_fpu(vcpu);
> > >>  	vcpu->arch.last_host_tsc = native_read_tsc();
> > >>+
> > >>+	/* For unstable TSC, force compensation and catchup on next CPU */
> > >>+	if (check_tsc_unstable()) {
> > >>+		vcpu->arch.tsc_rebase = 1;
> > >>+		kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
> > >>+	}
> > >The mix between catchup,trap versus stable,unstable TSC is confusing and
> > >difficult to grasp. Can you please introduce all the infrastructure
> > >first, then control usage of them in centralized places? Examples:
> > >
> > >+static void kvm_update_tsc_trapping(struct kvm *kvm)
> > >+{
> > >+       int trap, i;
> > >+       struct kvm_vcpu *vcpu;
> > >+
> > >+       trap = check_tsc_unstable() && atomic_read(&kvm->online_vcpus) > 1;
> > >+       kvm_for_each_vcpu(i, vcpu, kvm)
> > >+               kvm_x86_ops->set_tsc_trap(vcpu, trap && !vcpu->arch.time_page);
> > >+}
> > >
> > >+       /* For unstable TSC, force compensation and catchup on next CPU */
> > >+       if (check_tsc_unstable()) {
> > >+               vcpu->arch.tsc_rebase = 1;
> > >+               kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
> > >+       }
> > >
> > >
> > >kvm_guest_time_update is becoming very confusing too. I understand this
> > >is due to the many cases it's dealing with, but please make it as simple
> > >as possible.
> > 
> > I tried to comment as best as I could.  I think the whole
> > "kvm_update_tsc_trapping" thing is probably a poor design choice.
> > It works, but it's thoroughly unintelligible right now without
> > spending some days figuring out why.
> > 
> > I'll rework the tail series of patches to try to make them more clear.
> > 
> > >+       /*
> > >+        * If we are trapping and no longer need to, use catchup to
> > >+        * ensure passthrough TSC will not be less than trapped TSC
> > >+        */
> > >+       if (vcpu->tsc_mode == TSC_MODE_PASSTHROUGH && vcpu->tsc_trapping &&
> > >+           ((this_tsc_khz <= v->kvm->arch.virtual_tsc_khz || kvmclock))) {
> > >+               catchup = 1;
> > >
> > >What, TSC trapping with kvmclock enabled?
> > 
> > Transitioning to use of kvmclock after a cold boot means we may have
> > been trapping and now we will not be.
> > 
> > >For both catchup and trapping the resolution of the host clock is
> > >important, as Glauber commented for kvmclock. Can you comment on the
> > >problems that arise from a low res clock for both modes?
> > >
> > >Similarly for catchup mode, the effect of exit frequency. No need for
> > >any guarantees?
> > 
> > The scheduler will do something to get an IRQ at whatever resolution
> > it uses for its timeslice.  That guarantees an exit per timeslice,
> > so we'll never be behind by more than one slice while scheduling.
> > While not scheduling, we're dormant anyway, waiting on either an IRQ
> > or shared memory variable change.  Local timers could end up behind
> > when dormant.
> > 
> > We may need a hack to accelerate firing of timers in such a case, or
> > perhaps bounds on when to use catchup mode and when to not.
> 
> What about emulating rdtsc with low res clock? 
> 
> "The RDTSC instruction reads the time-stamp counter and is guaranteed to
> return a monotonically increasing unique value whenever executed, except
> for a 64-bit counter wraparound."
> 
This is bad semantics, IMHO. It is a totally different behaviour than the
one guest users would expect.
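
To make the concern concrete, here is a minimal sketch (not code from the
patch series) of how an emulated rdtsc backed by a low resolution clock
could still satisfy the letter of the quoted guarantee. The names
emu_tsc, ns_to_cycles and emu_rdtsc are hypothetical, and the clock
reading is passed in as a plain nanosecond value rather than taken from
any real kernel interface. When two reads land in the same clock tick,
the second is bumped by one cycle so the result stays unique and
increasing.

```c
#include <stdint.h>

/* Hypothetical per-vCPU state for illustration; not a KVM structure. */
struct emu_tsc {
	uint64_t last_returned;	/* last TSC value handed to the guest */
	uint64_t tsc_khz;	/* virtual TSC frequency in kHz */
};

/*
 * Convert nanoseconds to TSC cycles: cycles = ns * khz / 10^6.
 * The multiplication is split so the intermediate product stays
 * within 64 bits for realistic frequencies.
 */
static uint64_t ns_to_cycles(uint64_t ns, uint64_t tsc_khz)
{
	return (ns / 1000000) * tsc_khz +
	       (ns % 1000000) * tsc_khz / 1000000;
}

/*
 * Emulated rdtsc. now_ns is assumed to come from a coarse host clock,
 * so consecutive calls may see the same value. In that case return
 * last + 1 instead of repeating or going backwards.
 */
static uint64_t emu_rdtsc(struct emu_tsc *t, uint64_t now_ns)
{
	uint64_t tsc = ns_to_cycles(now_ns, t->tsc_khz);

	if (tsc <= t->last_returned)
		tsc = t->last_returned + 1;

	t->last_returned = tsc;
	return tsc;
}
```

With a coarse clock, a guest reading the counter in a tight loop would
see it creep forward one cycle at a time and then jump by a whole tick,
so elapsed-time measurements taken inside a tick would be close to zero.
The value is monotonic and unique, but that is not how a guest expects a
TSC to behave.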