Message-Id: <1409864744-26417-1-git-send-email-pbonzini@redhat.com>
Date: Thu, 4 Sep 2014 23:05:44 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Cc: chris.j.arges@...onical.com, Thomas Gleixner <tglx@...utronix.de>,
John Stultz <john.stultz@...aro.org>
Subject: [PATCH] KVM: x86: fix kvmclock breakage from timers branch merge
Commit cbcf2dd3b3d4 (x86: kvm: Make kvm_get_time_and_clockread() nanoseconds
based, 2014-07-16) used the wrong formula for boot_ns, thus breaking kvmclock on
hosts that have a reliable TSC.
To find the right formula, let's first backport the switch to nanoseconds
to 3.16-era timekeeping logic. The full patch (which works) is at
https://lkml.org/lkml/2014/9/4/462. The key line here is

    boot_ns = timespec_to_ns(&tk->total_sleep_time)
            + timespec_to_ns(&tk->wall_to_monotonic)
            + tk->xtime_sec * (u64)NSEC_PER_SEC;

Because the above patch works, the conclusion is that the above formula
is not the same as commit cbcf2dd3b3d4's

    boot_ns = ktime_to_ns(ktime_add(tk->tkr.base_mono, tk->offs_boot));

As to what is the right one, commit 02cba1598a2a (timekeeping: Simplify getboottime(),
2014-07-16) provides a hint:

    offs_real = -wall_to_monotonic
    offs_boot = total_sleep_time
    offs_real - offs_boot = -wall_to_monotonic - total_sleep_time

that is

    offs_boot - offs_real = wall_to_monotonic + total_sleep_time

which is what this patch uses, adding xtime_sec separately. The "boot_ns"
moniker is not too clear, so rename boot_ns to nsec_base and the existing
nsec_base to snsec_base.
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: John Stultz <john.stultz@...aro.org>
Reported-by: Chris J Arges <chris.j.arges@...onical.com>
Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
---
Thomas/John, the problem with the above explanation is that
tk_update_ktime_data has "base_mono = xtime_sec + wtm", and from
there "base_mono + offs_boot = xtime_sec + wtm + total_sleep_time".
Except that doesn't work, so something must be wrong in
tk_update_ktime_data's comment.
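
The algebra in the commit message can be checked in userspace. What follows is a minimal sketch, not kernel code: the struct tk_snapshot, the helper names and the idea of flattening every field to a signed nanosecond count are assumptions made for illustration only. It compares the 3.16-era formula with the "xtime_sec plus (offs_boot - offs_real)" form used by this patch.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical snapshot of the timekeeper fields named in the message. */
struct tk_snapshot {
	int64_t xtime_sec;             /* wall-clock seconds */
	int64_t wall_to_monotonic_ns;  /* wall -> monotonic offset, usually negative */
	int64_t total_sleep_time_ns;   /* time spent suspended */
};

/* 3.16-era formula quoted in the commit message. */
static int64_t base_from_3_16_fields(const struct tk_snapshot *s)
{
	return s->total_sleep_time_ns
	     + s->wall_to_monotonic_ns
	     + s->xtime_sec * NSEC_PER_SEC;
}

/* Form used by this patch: xtime_sec plus (offs_boot - offs_real). */
static int64_t base_from_offsets(const struct tk_snapshot *s)
{
	int64_t offs_real = -s->wall_to_monotonic_ns;
	int64_t offs_boot = s->total_sleep_time_ns;

	return s->xtime_sec * NSEC_PER_SEC + (offs_boot - offs_real);
}

/* Both expressions should agree for any snapshot that does not overflow. */
static void check_identity(const struct tk_snapshot *s)
{
	assert(base_from_3_16_fields(s) == base_from_offsets(s));
}
```

For example, with xtime_sec = 1409864744, a wall_to_monotonic of -1409860000 seconds and 100 seconds of sleep, both helpers yield 4844 seconds expressed in nanoseconds. This only confirms the identity as written; it says nothing about what base_mono actually contains.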
arch/x86/kvm/x86.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8f1e22d3b286..c55203bea337 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1020,8 +1020,8 @@ struct pvclock_gtod_data {
u32 shift;
} clock;
- u64 boot_ns;
u64 nsec_base;
+ u64 snsec_base;
};
static struct pvclock_gtod_data pvclock_gtod_data;
@@ -1031,7 +1031,7 @@ static void update_pvclock_gtod(struct timekeeper *tk)
struct pvclock_gtod_data *vdata = &pvclock_gtod_data;
u64 boot_ns;
- boot_ns = ktime_to_ns(ktime_add(tk->tkr.base_mono, tk->offs_boot));
+ boot_ns = ktime_to_ns(ktime_sub(tk->offs_boot, tk->offs_real));
write_seqcount_begin(&vdata->seq);
@@ -1042,8 +1042,9 @@ static void update_pvclock_gtod(struct timekeeper *tk)
vdata->clock.mult = tk->tkr.mult;
vdata->clock.shift = tk->tkr.shift;
- vdata->boot_ns = boot_ns;
- vdata->nsec_base = tk->tkr.xtime_nsec;
+ vdata->nsec_base = tk->xtime_sec * (u64)NSEC_PER_SEC
+ + boot_ns;
+ vdata->snsec_base = tk->tkr.xtime_nsec;
write_seqcount_end(&vdata->seq);
}
@@ -1413,10 +1414,10 @@ static int do_monotonic_boot(s64 *t, cycle_t *cycle_now)
do {
seq = read_seqcount_begin(&gtod->seq);
mode = gtod->clock.vclock_mode;
- ns = gtod->nsec_base;
+ ns = gtod->snsec_base;
ns += vgettsc(cycle_now);
ns >>= gtod->clock.shift;
- ns += gtod->boot_ns;
+ ns += gtod->nsec_base;
} while (unlikely(read_seqcount_retry(&gtod->seq, seq)));
*t = ns;
--
1.8.3.1
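
The read-side arithmetic in the last hunk can be sketched outside the kernel as follows. This is an illustration under stated assumptions, not the KVM implementation: struct gtod_sketch and sketch_boot_ns() are hypothetical names, the seqcount retry loop is left out, and vgettsc() is modelled as "masked cycle delta times mult". The point it shows is that snsec_base is still in shifted units (it is added before the right shift), while nsec_base is plain nanoseconds (added after it).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct pvclock_gtod_data. */
struct gtod_sketch {
	uint64_t cycle_last;  /* counter value at the last timekeeper update */
	uint64_t mask;        /* clocksource counter mask */
	uint32_t mult;        /* cycles -> shifted-ns multiplier */
	uint32_t shift;       /* right shift that yields nanoseconds */
	uint64_t nsec_base;   /* plain nanoseconds, added after the shift */
	uint64_t snsec_base;  /* shifted nanoseconds, added before the shift */
};

/* Mirrors the order of operations in the patched do_monotonic_boot(). */
static uint64_t sketch_boot_ns(const struct gtod_sketch *g, uint64_t cycle_now)
{
	uint64_t delta, ns;

	assert(g->shift < 64);  /* a larger shift would be undefined in C */

	delta = (cycle_now - g->cycle_last) & g->mask;
	ns = g->snsec_base;     /* ns  = gtod->snsec_base;    */
	ns += delta * g->mult;  /* ns += vgettsc(cycle_now);  (assumed form) */
	ns >>= g->shift;        /* ns >>= gtod->clock.shift;  */
	ns += g->nsec_base;     /* ns += gtod->nsec_base;     */
	return ns;
}
```

With mult = 4, shift = 2, snsec_base = 250 << 2, nsec_base = 5000000000 and a delta of 100 cycles, the function returns 5000000350: 100 ns from the cycle delta plus 250 ns of sub-second remainder on top of the base.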