Message-ID: <5b905902c99e13d65ea0810b0885fca97cffc74d.camel@infradead.org>
Date: Thu, 21 Aug 2025 21:09:16 +0100
From: David Woodhouse <dwmw2@...radead.org>
To: Sohil Mehta <sohil.mehta@...el.com>, x86@...nel.org, Dave Hansen
<dave.hansen@...ux.intel.com>, Tony Luck <tony.luck@...el.com>,
Jürgen Gross
<jgross@...e.com>, Boris Ostrovsky <boris.ostrovsky@...cle.com>, xen-devel
<xen-devel@...ts.xenproject.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim
<namhyung@...nel.org>, Mark Rutland <mark.rutland@....com>, Alexander
Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa
<jolsa@...nel.org>, Ian Rogers <irogers@...gle.com>, Adrian Hunter
<adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>, Thomas
Gleixner <tglx@...utronix.de>, Borislav Petkov <bp@...en8.de>, "H . Peter
Anvin" <hpa@...or.com>, "Rafael J . Wysocki" <rafael@...nel.org>, Len Brown
<lenb@...nel.org>, Andy Lutomirski <luto@...nel.org>, Viresh Kumar
<viresh.kumar@...aro.org>, Jean Delvare <jdelvare@...e.com>, Guenter Roeck
<linux@...ck-us.net>, Zhang Rui <rui.zhang@...el.com>, Andrew Cooper
<andrew.cooper3@...rix.com>, David Laight <david.laight.linux@...il.com>,
Dapeng Mi <dapeng1.mi@...ux.intel.com>, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-acpi@...r.kernel.org,
linux-pm@...r.kernel.org, kvm@...r.kernel.org, xiaoyao.li@...el.com, Xin
Li <xin@...or.com>
Subject: Re: [PATCH v3 13/15] x86/cpu/intel: Bound the non-architectural
 constant_tsc model checks

On Thu, 2025-08-21 at 12:43 -0700, Sohil Mehta wrote:
> On 8/21/2025 12:34 PM, Sohil Mehta wrote:
> > On 8/21/2025 6:15 AM, David Woodhouse wrote:
> >
> > > Hm. My test host is INTEL_HASWELL_X (0x63f). For reasons which are
> > > unclear to me, QEMU doesn't set bit 8 of 0x80000007 EDX unless I
> > > explicitly append ',+invtsc' to the existing '-cpu host' on its command
> > > line. So now my guest doesn't think it has X86_FEATURE_CONSTANT_TSC.
> > >
> >
> > Haswell should have X86_FEATURE_CONSTANT_TSC, so I would have expected
> > the guest bit to be set. Until now, X86_FEATURE_CONSTANT_TSC was set
> > based on the Family-model instead of the CPUID enumeration which may
> > have hid the issue.
> >
>
> Correction:
> s/instead/as well as
>
> > From my initial look at the QEMU implementation, this seems intentional.
> >
> > QEMU considers Invariant TSC as un-migratable which prevents it from
> > being exposed to migratable guests (default).
> > target/i386/cpu.c:
> > [FEAT_8000_0007_EDX]
> > .unmigratable_flags = CPUID_APM_INVTSC,
> >
> > Can you please try '-cpu host,migratable=off'?
>
> This is mainly to verify. If confirmed, I am not sure what the long term
> solution should be.

Yes, explicitly turning it on with -cpu host,+invtsc does work.
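
For anyone wanting to confirm from inside the guest what QEMU actually exposes, the bit under discussion (CPUID leaf 0x80000007, EDX bit 8) can be read from userspace. This is only a minimal sketch: the helper names are made up for illustration, and it relies on the __get_cpuid() wrapper from the GCC/Clang <cpuid.h> header.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

#define APM_LEAF        0x80000007u  /* Advanced Power Management leaf */
#define APM_EDX_INVTSC  (1u << 8)    /* EDX bit 8: invariant TSC */

/* Hypothetical helper: test the invariant-TSC bit in an EDX value. */
static bool edx_has_invtsc(uint32_t edx)
{
	return (edx & APM_EDX_INVTSC) != 0;
}

/*
 * Hypothetical helper: returns 1 if the running CPU (or vCPU) advertises
 * an invariant TSC, 0 if it does not, and -1 if the leaf cannot be read.
 */
static int cpu_has_invtsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 when the leaf is above the maximum. */
	if (!__get_cpuid(APM_LEAF, &eax, &ebx, &ecx, &edx))
		return -1;
	return edx_has_invtsc(edx) ? 1 : 0;
#else
	return -1;	/* not an x86 build */
#endif
}
```

Going by the behaviour described above, a guest started with plain '-cpu host' would be expected to return 0 here, and 1 once ',+invtsc' is appended.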

I've been looking into why it takes a Xen guest four seconds per vCPU
in this case, but not a KVM guest.

When running as a KVM guest, Linux will infer the TSC frequency from
the KVM clock, or better still, from CPUID; see
https://lore.kernel.org/all/20250816101308.2594298-1-dwmw2@infradead.org
and/or
https://lore.kernel.org/all/20250227021855.3257188-36-seanjc@google.com

As a Xen guest though, Linux doesn't do that. This patch in the guest
should make it work without recalibrating the TSC for each vCPU...

--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -489,7 +489,15 @@ static void xen_setup_vsyscall_time_info(void)
  */
 static int __init xen_tsc_safe_clocksource(void)
 {
-	u32 eax, ebx, ecx, edx;
+	u32 eax, ebx, ecx, edx;
+	u64 lpj;
+
+	/* Leaf 4, sub-leaf 0 (0x40000x03) */
+	cpuid_count(xen_cpuid_base() + 3, 0, &eax, &ebx, &ecx, &edx);
+
+	lpj = ((u64)ecx * 1000);
+	do_div(lpj, HZ);
+	preset_lpj = lpj;
 
 	if (!(boot_cpu_has(X86_FEATURE_CONSTANT_TSC)))
 		return 0;
@@ -500,9 +508,6 @@ static int __init xen_tsc_safe_clocksource(void)
 	if (check_tsc_unstable())
 		return 0;
 
-	/* Leaf 4, sub-leaf 0 (0x40000x03) */
-	cpuid_count(xen_cpuid_base() + 3, 0, &eax, &ebx, &ecx, &edx);
-
 	return ebx == XEN_CPUID_TSC_MODE_NEVER_EMULATE;
 }
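
The arithmetic in that hunk is simply "TSC ticks per jiffy": ECX is taken as the guest TSC frequency in kHz, multiplied by 1000 to get Hz, and divided by HZ. A standalone userspace sketch of the same calculation follows; the function name and the explicit hz parameter are illustrative and not part of the patch.

```c
#include <stdint.h>

/*
 * Mirrors lpj = tsc_khz * 1000 / HZ from the patch above, with HZ passed
 * in as a parameter so the arithmetic can be tried outside the kernel.
 * The 64-bit cast matters: tsc_khz * 1000 overflows 32 bits for any TSC
 * faster than roughly 4.29 GHz.
 */
static uint64_t lpj_from_tsc_khz(uint32_t tsc_khz, uint32_t hz)
{
	return ((uint64_t)tsc_khz * 1000u) / hz;
}
```

As a worked example with a hypothetical 2.4 GHz guest TSC (2400000 kHz): with HZ=1000 this gives 2400000 loops per jiffy, and with HZ=250 it gives 9600000.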

... but then I got slightly distracted by the question of why I was
getting *nonsense* in those values, and why KVM is 'correcting' EAX in
subleaf 2 which is supposed to be the *host* TSC, not ECX in subleaf
zero...

Under the Fedora 6.13.8-200 kernel I'm fairly sure the guest was seeing
values in subleaf 0 ECX/EDX that *should* have been in subleaf 1
ECX/EDX, and that problem went away when I rebooted the host into a
mainline kernel. Will have to go back and retest that part...
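
To make the sub-leaf mix-up easier to reason about, here is a small sketch of how the three sub-leaves of the Xen time leaf (base + 3) are usually described. The field assignments below are written from memory of Xen's public arch-x86/cpuid.h header and should be checked against it; the struct and function names are purely illustrative.

```c
#include <stdint.h>

/* Raw register values returned by one CPUID invocation. */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* Illustrative container for the fields discussed in this thread. */
struct xen_tsc_info {
	uint32_t tsc_mode;        /* sub-leaf 0 EBX (assumed layout) */
	uint32_t guest_tsc_khz;   /* sub-leaf 0 ECX (assumed layout) */
	uint32_t tsc_to_ns_mul;   /* sub-leaf 1 ECX (assumed layout) */
	uint32_t tsc_to_ns_shift; /* sub-leaf 1 EDX (assumed layout) */
	uint32_t host_tsc_khz;    /* sub-leaf 2 EAX (assumed layout) */
};

/*
 * Decode sub-leaves 0..2 of leaf (xen_cpuid_base() + 3). The caller is
 * expected to have filled sub[i] from cpuid_count(base + 3, i, ...).
 */
static struct xen_tsc_info xen_decode_tsc_leaf(const struct cpuid_regs sub[3])
{
	struct xen_tsc_info info;

	info.tsc_mode        = sub[0].ebx;
	info.guest_tsc_khz   = sub[0].ecx;
	info.tsc_to_ns_mul   = sub[1].ecx;
	info.tsc_to_ns_shift = sub[1].edx;
	info.host_tsc_khz    = sub[2].eax;
	return info;
}
```

If that layout is right, a guest that finds sub-leaf 1's ECX/EDX values in sub-leaf 0 would compute its lpj from the tsc->ns multiplier rather than from a frequency in kHz, which would account for the nonsense values.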