Message-ID: <aKdzH2b8ShTVeWhx@google.com>
Date: Thu, 21 Aug 2025 12:27:27 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: David Woodhouse <dwmw2@...radead.org>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, Vitaly Kuznetsov <vkuznets@...hat.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, graf@...zon.de,
Ajay Kaher <ajay.kaher@...adcom.com>, Alexey Makhalov <alexey.makhalov@...adcom.com>,
Colin Percival <cperciva@...snap.com>
Subject: Re: [PATCH v2 0/3] Support "generic" CPUID timing leaf as KVM guest
and host
On Thu, Aug 21, 2025, David Woodhouse wrote:
> On Thu, 2025-08-21 at 09:26 -0700, Sean Christopherson wrote:
> > On Sat, Aug 16, 2025, David Woodhouse wrote:
> > > In https://lkml.org/lkml/2008/10/1/246 VMware proposed a generic standard
> > > for harmonising CPUID between hypervisors. It was mostly shot down in
> > > flames, but the generic timing leaf at 0x4000_0010 didn't quite die.
> > >
> > > Mostly the hypervisor leaves at 0x4000_0xxx are very hypervisor-specific,
> > > but XNU and FreeBSD as guests will look for 0x4000_0010 unconditionally,
> > > under any hypervisor. The EC2 Nitro hypervisor has also exposed TSC
> > > frequency information in this leaf, since 2020.
> > >
> > > As things stand, KVM guests have to reverse-calculate the TSC frequency
> > > from the mul/shift information given to them in the KVM clock to convert
> > > ticks into nanoseconds, with a corresponding loss of precision.
> >
> > I would rather have the VMM use the Intel-defined CPUID.0x15 to enumerate the
> > TSC frequency.
>
> The problem with that is that it's been quite unreliable. The kernel
> doesn't trust it even on chips as recent (hah) as Skylake. I'd be
> happier to trust what the hypervisor explicitly gives us. But yes, it
> should be *one* of the sources of information before we reverse-
> calculate it from the pvclock.
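
For reference, the reverse calculation mentioned above can be sketched roughly
as follows. This is only an illustration, assuming the usual pvclock scaling
of ns = ((ticks << shift) * mul) >> 32; the function name and the bare
mul/shift parameters are hypothetical stand-ins, not the kernel's actual
interface. The integer division is where the precision goes.

```c
#include <stdint.h>

/*
 * Sketch only: recover the TSC frequency in kHz from a pvclock-style
 * mul/shift pair.  Since ns = ticks * 2^shift * mul / 2^32, the number
 * of ticks per millisecond is (10^6 << 32) / mul, scaled by 2^-shift.
 * The name and signature are assumptions made for this example.
 */
static uint64_t pvclock_to_tsc_khz(uint32_t mul, int8_t shift)
{
	uint64_t khz;

	if (mul == 0)
		return 0;	/* no usable scaling information */

	khz = (1000000ULL << 32) / mul;	/* truncating division */
	if (shift < 0)
		khz <<= -shift;
	else
		khz >>= shift;

	return khz;
}
```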

Sorry, by "the VMM use" I meant: have the host, e.g. QEMU, explicitly define the
TSC frequency in CPUID.0x15 and the CPU frequency in CPUID.0x16. And then on the
KVM-as-a-guest side of things, trust those leaves when they're available.
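
To make the guest side concrete, here is a minimal sketch of turning those
leaves into frequencies, assuming the documented layout: CPUID.0x15 reports
the TSC/crystal ratio as EBX/EAX with the crystal frequency in Hz in ECX, and
CPUID.0x16 reports the base frequency in MHz in EAX[15:0]. The helper names
are hypothetical, and the helpers take raw register values so the arithmetic
is visible independent of how CPUID gets executed.

```c
#include <stdint.h>

/*
 * Sketch only: TSC frequency in Hz from CPUID.0x15 register values.
 *   EAX = denominator of the TSC/crystal ratio
 *   EBX = numerator of the TSC/crystal ratio
 *   ECX = core crystal clock frequency in Hz
 * Returns 0 if any field is not enumerated, in which case the caller
 * would have to fall back to another source.
 */
static uint64_t cpuid15_tsc_hz(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
	if (eax == 0 || ebx == 0 || ecx == 0)
		return 0;

	return (uint64_t)ecx * ebx / eax;
}

/* Sketch only: processor base frequency in MHz from CPUID.0x16 EAX[15:0]. */
static uint32_t cpuid16_base_mhz(uint32_t eax)
{
	return eax & 0xffff;
}
```

E.g. a VMM that sets EAX=2, EBX=250, ECX=24000000 in leaf 0x15 is describing a
3 GHz TSC, with no reverse calculation needed in the guest.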

So same idea as having the VMM fill 0x4000_0010, but piggyback on the
Intel-defined leaves instead of the VMware-defined leaf. One of the reasons I'd
like to go that route is to avoid having to choose one or the other when running
under TDX, where CPUID.{0x15,0x16} are provided by the "trusted" TDX-Module, but
any PV leaf is not.

Dunno how feasible it is to get non-Linux guests on board though...