[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250116162525.GFZ4ky9TdSn7jltgw7@fat_crate.local>
Date: Thu, 16 Jan 2025 17:25:25 +0100
From: Borislav Petkov <bp@...en8.de>
To: Sean Christopherson <seanjc@...gle.com>
Cc: "Nikunj A. Dadhania" <nikunj@....com>, linux-kernel@...r.kernel.org,
thomas.lendacky@....com, x86@...nel.org, kvm@...r.kernel.org,
mingo@...hat.com, tglx@...utronix.de, dave.hansen@...ux.intel.com,
pgonda@...gle.com, pbonzini@...hat.com, francescolavra.fl@...il.com,
Alexey Makhalov <alexey.makhalov@...adcom.com>,
Juergen Gross <jgross@...e.com>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>
Subject: Re: [PATCH v16 12/13] x86/tsc: Switch to native sched clock
On Wed, Jan 15, 2025 at 01:37:25PM -0800, Sean Christopherson wrote:
> My strong vote is prefer TSC over kvmclock for sched_clock if the TSC is constant,
> nonstop, and not marked stable via command line.
So how do we deal with the case here where a malicious HV could set those bits
and still tweak the TSC?
IOW, I think Secure TSC and TDX should be the only ones who trust the TSC when
in a guest.
If anything, trusting the TSC in a normal guest should at least issue
a warning saying that we do use the TSC but there's no 100% guarantee that it
is trustworthy...
> But wait, there's more! Because TDX doesn't override .calibrate_tsc() or
> .calibrate_cpu(), even though TDX provides a trusted TSC *and* enumerates the
> frequency of the TSC, unless I'm missing something, tsc_early_init() will compute
> the TSC frequency using the information provided by KVM, i.e. the untrusted host.
Yeah, I guess we don't want that. Or at least we should warn about it.
> + /*
> + * If the TSC counts at a constant frequency across P/T states, counts
> + * in deep C-states, and the TSC hasn't been marked unstable, prefer
> + * the TSC over kvmclock for sched_clock and drop kvmclock's rating so
> + * that TSC is chosen as the clocksource. Note, the TSC unstable check
> + * exists purely to honor the TSC being marked unstable via command
> + * line, any runtime detection of an unstable will happen after this.
> + */
> + if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
> + boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
> + !check_tsc_unstable()) {
> + kvm_clock.rating = 299;
> + pr_warn("kvm-clock: Using native sched_clock\n");
The warn is in the right direction but probably should say TSC still cannot be
trusted 100%.
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 0864b314c26a..9baffb425386 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -663,7 +663,12 @@ unsigned long native_calibrate_tsc(void)
> unsigned int eax_denominator, ebx_numerator, ecx_hz, edx;
> unsigned int crystal_khz;
>
> - if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
> + /*
> + * Ignore the vendor when running as a VM, if the hypervisor provides
> + * garbage CPUID information then the vendor is also suspect.
> + */
> + if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL &&
> + !boot_cpu_has(X86_FEATURE_HYPERVISOR))
> return 0;
>
> if (boot_cpu_data.cpuid_level < 0x15)
> @@ -713,10 +718,13 @@ unsigned long native_calibrate_tsc(void)
> return 0;
>
> /*
> - * For Atom SoCs TSC is the only reliable clocksource.
> - * Mark TSC reliable so no watchdog on it.
> + * For Atom SoCs TSC is the only reliable clocksource. Similarly, in a
> + * VM, any watchdog is going to be less reliable than the TSC as the
> + * watchdog source will be emulated in software. In both cases, mark
> + * the TSC reliable so that no watchdog runs on it.
> */
> - if (boot_cpu_data.x86_vfm == INTEL_ATOM_GOLDMONT)
> + if (boot_cpu_has(X86_FEATURE_HYPERVISOR) ||
> + boot_cpu_data.x86_vfm == INTEL_ATOM_GOLDMONT)
> setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
>
> #ifdef CONFIG_X86_LOCAL_APIC
It looks all wrong if a function called native_* sprinkles a bunch of "am
I a guest" checks. I guess we should split it into VM and native variants.
But yeah, the general direction is ok once we agree on what we do when
exactly.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
Powered by blists - more mailing lists