[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aLqX035O0lQEVPrl@google.com>
Date: Fri, 5 Sep 2025 00:57:07 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: David Woodhouse <dwmw2@...radead.org>
Cc: Paul Durrant <pdurrant@...zon.co.uk>, Fred Griffoul <fgriffo@...zon.co.uk>,
Colin Percival <cperciva@...snap.com>, Paolo Bonzini <pbonzini@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, "x86@...nel.org" <x86@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>, Vitaly Kuznetsov <vkuznets@...hat.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "Graf (AWS), Alexander" <graf@...zon.de>,
Ajay Kaher <ajay.kaher@...adcom.com>, Alexey Makhalov <alexey.makhalov@...adcom.com>
Subject: Re: [PATCH v2 0/3] Support "generic" CPUID timing leaf as KVM guest
and host
On Thu, Sep 04, 2025, David Woodhouse wrote:
> On Thu, 2025-09-04 at 06:25 -0700, Sean Christopherson wrote:
> > Anyways, I'm a-ok reporting that information in KVM_GET_SUPPORTED_CPUID (again,
> > only with constant TSC and scaling). Reporting the effective frequency would be
> > useful for the host too, e.g. for sanity checks. What I specifically want to
> > avoid is modifying guest CPUID at runtime.
>
> Hm, in some cases I thought KVM had deliberately moved *to* doing CPUID
> updates at runtime, so that its doesn't have to exempt the changable
> leaves from the sanity checks which prevent userspace from updating
> CPUID for a CPU which has already been run.
Ah, I shouldn't have qualified my statement with "runtime". I don't want KVM
modifying incoming CPUID at all, as KVM's attempts to "help" userspace have
backfired more often than not. The only scenarios where modifying CPUID is ok
is for cases where a change in state architectural affects CPUID output, e.g. on
CR4 or MSR changes.
Moving the Xen CPUID fixup to runtime was essentially the least awful way to deal
with KVM disallowing post-run CPUID changes, the underlying problem is that KVM
was filling Xen CPUID in the first place.
> It's not just the existing Xen TSC leaf which is updated at runtime in
> kvm_cpuid().
>
> But I don't mind too much. If we give userspace a way to *know* the
> effective frequency, I'm OK with requiring that userspace do so and
> populate the corresponding CPUID leaves for itself, for Xen and KVM
> alike. We'd need to expose the FSB frequency too, not just TSC.
>
> I was only going with the runtime update because we are literally
> already *doing* it this way in KVM.
>
> > Hmm, the only wrinkle is that, if there is slop, KVM could report different
> > information when run on different platforms, e.g. after live migration. But so
> > long as that possibility is documented, I don't think it's truly problematic.
> > And it's another argument for not modifying guest CPUID directly; I'd rather let
> > userspace figure out whether or not they care about the divergence than silently
> > change things from the guest's perspective.
> >
> > Alternatively (or in addition to), part of me wants to stealtily update
> > KVM_GET_TSC_KHZ to report back the effective frequency, but I can see that being
> > problematic, e.g. if a naive VMM reads KVM_GET_TSC_KHZ when saving vCPU state for
> > live migration and after enough migrations, the slop ends up drastically skewing
> > the guest's frequency.
>
> Indeed. And I also want to tell userspace the precise *ratio* being
> applied by hardware scaling, for the VMClock case where userspace
> definitely knows *better* about what the host TSC frequency is at this
> precise moment, and has to tell the guest what *its* TSC frequency is,
> with the same precision.
Maybe add the scaled/effective frequency and the ratio information as read-only
TSC attributes, e.g.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7ba2cdfdac44..4ba4c88f3d33 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5788,6 +5788,18 @@ static int kvm_arch_tsc_get_attr(struct kvm_vcpu *vcpu,
break;
r = 0;
break;
+ case KVM_VCPU_TSC_SCALED_KHZ:
+ r = -EFAULT;
+ if (put_user(vcpu->arch.hw_tsc_khz, uaddr))
+ break;
+ r = 0;
+ break;
+ case KVM_VCPU_TSC_SCALED_RATIO:
+ r = -EFAULT;
+ if (put_user(<math>, uaddr))
+ break;
+ r = 0;
+ break;
default:
r = -ENXIO;
}
Powered by blists - more mailing lists