lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250116162525.GFZ4ky9TdSn7jltgw7@fat_crate.local>
Date: Thu, 16 Jan 2025 17:25:25 +0100
From: Borislav Petkov <bp@...en8.de>
To: Sean Christopherson <seanjc@...gle.com>
Cc: "Nikunj A. Dadhania" <nikunj@....com>, linux-kernel@...r.kernel.org,
	thomas.lendacky@....com, x86@...nel.org, kvm@...r.kernel.org,
	mingo@...hat.com, tglx@...utronix.de, dave.hansen@...ux.intel.com,
	pgonda@...gle.com, pbonzini@...hat.com, francescolavra.fl@...il.com,
	Alexey Makhalov <alexey.makhalov@...adcom.com>,
	Juergen Gross <jgross@...e.com>,
	Boris Ostrovsky <boris.ostrovsky@...cle.com>
Subject: Re: [PATCH v16 12/13] x86/tsc: Switch to native sched clock

On Wed, Jan 15, 2025 at 01:37:25PM -0800, Sean Christopherson wrote:
> My strong vote is prefer TSC over kvmclock for sched_clock if the TSC is constant,
> nonstop, and not marked stable via command line.

So how do we deal with the case here where a malicious HV could set those bits
and still tweak the TSC?

IOW, I think Secure TSC and TDX should be the only ones who trust the TSC when
in a guest.

If anything, trusting the TSC in a normal guest should at least issue
a warning saying that we do use the TSC but there's no 100% guarantee that it
is trustworthy...

> But wait, there's more!  Because TDX doesn't override .calibrate_tsc() or
> .calibrate_cpu(), even though TDX provides a trusted TSC *and* enumerates the
> frequency of the TSC, unless I'm missing something, tsc_early_init() will compute
> the TSC frequency using the information provided by KVM, i.e. the untrusted host.

Yeah, I guess we don't want that. Or at least we should warn about it.
 
> +	/*
> +	 * If the TSC counts at a constant frequency across P/T states, counts
> +	 * in deep C-states, and the TSC hasn't been marked unstable, prefer
> +	 * the TSC over kvmclock for sched_clock and drop kvmclock's rating so
> +	 * that TSC is chosen as the clocksource.  Note, the TSC unstable check
> +	 * exists purely to honor the TSC being marked unstable via command
> +	 * line, any runtime detection of an unstable will happen after this.
> +	 */
> +	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
> +	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
> +	    !check_tsc_unstable()) {
> +		kvm_clock.rating = 299;
> +		pr_warn("kvm-clock: Using native sched_clock\n");

The warn is in the right direction but probably should say TSC still cannot be
trusted 100%.

> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 0864b314c26a..9baffb425386 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -663,7 +663,12 @@ unsigned long native_calibrate_tsc(void)
>  	unsigned int eax_denominator, ebx_numerator, ecx_hz, edx;
>  	unsigned int crystal_khz;
>  
> -	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
> +	/*
> +	 * Ignore the vendor when running as a VM, if the hypervisor provides
> +	 * garbage CPUID information then the vendor is also suspect.
> +	 */
> +	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL &&
> +	    !boot_cpu_has(X86_FEATURE_HYPERVISOR))
>  		return 0;
>  
>  	if (boot_cpu_data.cpuid_level < 0x15)
> @@ -713,10 +718,13 @@ unsigned long native_calibrate_tsc(void)
>  		return 0;
>  
>  	/*
> -	 * For Atom SoCs TSC is the only reliable clocksource.
> -	 * Mark TSC reliable so no watchdog on it.
> +	 * For Atom SoCs TSC is the only reliable clocksource.  Similarly, in a
> +	 * VM, any watchdog is going to be less reliable than the TSC as the
> +	 * watchdog source will be emulated in software.  In both cases, mark
> +	 * the TSC reliable so that no watchdog runs on it.
>  	 */
> -	if (boot_cpu_data.x86_vfm == INTEL_ATOM_GOLDMONT)
> +	if (boot_cpu_has(X86_FEATURE_HYPERVISOR) ||
> +	    boot_cpu_data.x86_vfm == INTEL_ATOM_GOLDMONT)
>  		setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
>  
>  #ifdef CONFIG_X86_LOCAL_APIC

It looks all wrong if a function called native_* sprinkles a bunch of "am
I a guest" checks. I guess we should split it into VM and native variants.

But yeah, the general direction is ok once we agree on what we do when
exactly.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ