Message-ID: <20250814120237.2469583-4-dwmw2@infradead.org>
Date: Thu, 14 Aug 2025 12:56:05 +0100
From: David Woodhouse <dwmw2@...radead.org>
To: Sean Christopherson <seanjc@...gle.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
kvm@...r.kernel.org,
linux-kernel@...r.kernel.org,
graf@...zon.de,
Ajay Kaher <ajay.kaher@...adcom.com>,
Alexey Makhalov <alexey.makhalov@...adcom.com>,
Alok N Kataria <akataria@...are.com>
Subject: [PATCH 3/3] x86/kvm: Obtain TSC frequency from CPUID if present
From: David Woodhouse <dwmw@...zon.co.uk>
In https://lkml.org/lkml/2008/10/1/246 a proposal was made for generic
CPUID conventions across hypervisors. It was mostly shot down in flames,
but the leaf at 0x40000010 containing timing information didn't die.
It's used by XNU and FreeBSD guests under all hypervisors¹² to determine
the TSC frequency, and also exposed by the EC2 Nitro hypervisor (as
well as, presumably, VMware). FreeBSD's Bhyve is probably just about
to start exposing it too.
Use it under KVM to obtain the TSC frequency more accurately, instead
of reverse-calculating the frequency from the mul/shift values in the
KVM clock.
Before:
[ 0.000020] tsc: Detected 2900.014 MHz processor
After:
[ 0.000020] tsc: Detected 2900.015 MHz processor
$ cpuid -1 -l 0x40000010
CPU:
hypervisor generic timing information (0x40000010):
TSC frequency (Hz) = 2900015
bus frequency (Hz) = 1000000
¹ https://github.com/apple/darwin-xnu/blob/main/osfmk/i386/cpuid.c
² https://github.com/freebsd/freebsd-src/commit/4a432614f68
Signed-off-by: David Woodhouse <dwmw@...zon.co.uk>
---
arch/x86/include/asm/kvm_para.h | 1 +
arch/x86/kernel/kvm.c | 10 ++++++++++
arch/x86/kernel/kvmclock.c | 7 ++++++-
3 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 57bc74e112f2..d53927103cab 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -121,6 +121,7 @@ static inline long kvm_sev_hypercall3(unsigned int nr, unsigned long p1,
void kvmclock_init(void);
void kvmclock_disable(void);
bool kvm_para_available(void);
+unsigned int kvm_para_tsc_khz(void);
unsigned int kvm_arch_para_features(void);
unsigned int kvm_arch_para_hints(void);
void kvm_async_pf_task_wait_schedule(u32 token);
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 8ae750cde0c6..1a80f4e5c854 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -896,6 +896,16 @@ bool kvm_para_available(void)
}
EXPORT_SYMBOL_GPL(kvm_para_available);
+unsigned int kvm_para_tsc_khz(void)
+{
+ u32 base = kvm_cpuid_base();
+
+ if (cpuid_eax(base) >= (base | KVM_CPUID_TIMING_INFO))
+ return cpuid_eax(base | KVM_CPUID_TIMING_INFO);
+
+ return 0;
+}
+
unsigned int kvm_arch_para_features(void)
{
return cpuid_eax(kvm_cpuid_base() | KVM_CPUID_FEATURES);
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index ca0a49eeac4a..0908450ebac9 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -117,7 +117,12 @@ static inline void kvm_sched_clock_init(bool stable)
static unsigned long kvm_get_tsc_khz(void)
{
setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
- return pvclock_tsc_khz(this_cpu_pvti());
+
+ /*
+ * If KVM advertises the frequency directly in CPUID, use that
+ * instead of reverse-calculating it from the KVM clock data.
+ */
+ return kvm_para_tsc_khz() ? : pvclock_tsc_khz(this_cpu_pvti());
}
static void __init kvm_get_preset_lpj(void)
--
2.49.0