lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 9 May 2022 12:58:39 +0800
From:   Feng Tang <feng.tang@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...el.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Jonathan Corbet <corbet@....net>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Cc:     paulmck@...nel.org, rui.zhang@...el.com, len.brown@...el.com,
        tim.c.chen@...el.com
Subject: Re: [PATCH] x86/tsc: Add option to force HW timer based recalibration

Sorry, just spotted some typos, here is the updated version


>From ee8e3d772c623d27d79c43da5a76fb6252175aba Mon Sep 17 00:00:00 2001
From: Feng Tang <feng.tang@...el.com>
Date: Sun, 8 May 2022 20:22:12 +0800
Subject: [PATCH] x86/tsc: Add option to force HW timer based recalibration

Currently when HW provides the tsc freq info through MSR or CPUID(0x15),
the info will be taken as the 'best guess', and kernel will set the
X86_FEATURE_TSC_KNOWN_FREQ flag and skip the HW timer based recalibration,
which works pretty well.

And there is still very few corner case that the freq info is not
accurate enough with small deviation from the actual value, like on
a product with early buggy version of firmware or on some
pre-production hardware.

Add an option 'recalibrate' for 'tsc' kernel parameter to force the
tsc freq recalibration with HPET/PM_TIMER, and warn if the deviation
from previous value is more than about 500 PPM.

Signed-off-by: Feng Tang <feng.tang@...el.com>
---
 .../admin-guide/kernel-parameters.txt         |  5 +++
 arch/x86/kernel/tsc.c                         | 34 ++++++++++++++++---
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3f1cc5e317ed..1e06196a591e 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5981,6 +5981,11 @@
 			in situations with strict latency requirements (where
 			interruptions from clocksource watchdog are not
 			acceptable).
+			[x86] recalibrate: force to do freq recalibration with
+			a HW timer (HPET or PM_TIMER). When HW provides tsc freq
+			info through MSR or CPUID(0x15), kernel will take it as
+			the 'best guess', but there is corner case that the info
+			could be wrong, and need a double check through HW timer.
 
 	tsc_early_khz=  [X86] Skip early TSC calibration and use the given
 			value instead. Useful when the early TSC frequency discovery
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index cafacb2e58cc..5cf62a58754a 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -48,6 +48,8 @@ static DEFINE_STATIC_KEY_FALSE(__use_tsc);
 
 int tsc_clocksource_reliable;
 
+static int __read_mostly tsc_force_recalibrate;
+
 static u32 art_to_tsc_numerator;
 static u32 art_to_tsc_denominator;
 static u64 art_to_tsc_offset;
@@ -303,6 +305,8 @@ static int __init tsc_setup(char *str)
 		mark_tsc_unstable("boot parameter");
 	if (!strcmp(str, "nowatchdog"))
 		no_tsc_watchdog = 1;
+	if (!strcmp(str, "recalibrate"))
+		tsc_force_recalibrate = 1;
 	return 1;
 }
 
@@ -1374,6 +1378,25 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 	else
 		freq = calc_pmtimer_ref(delta, ref_start, ref_stop);
 
+	/* Will hit this only if tsc_force_recalibrate has been set */
+	if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ)) {
+
+		/* Warn if the deviation exceeds 500 ppm */
+		if (abs(tsc_khz - freq) > (tsc_khz >> 11)) {
+			pr_warn("Warning: TSC freq calibrated by CPUID/MSR differs from what is calibrated by HW timer, please check with vendor!!\n");
+			pr_info("Previous calibrated TSC freq:\t %lu.%03lu MHz\n",
+				(unsigned long)tsc_khz / 1000,
+				(unsigned long)tsc_khz % 1000);
+		}
+
+		pr_info("TSC freq recalibrated by [%s]:\t %lu.%03lu MHz\n",
+			hpet ? "HPET" : "PM_TIMER",
+			(unsigned long)freq / 1000,
+			(unsigned long)freq % 1000);
+
+		return;
+	}
+
 	/* Make sure we're within 1% */
 	if (abs(tsc_khz - freq) > tsc_khz/100)
 		goto out;
@@ -1407,8 +1430,10 @@ static int __init init_tsc_clocksource(void)
 	if (!boot_cpu_has(X86_FEATURE_TSC) || !tsc_khz)
 		return 0;
 
-	if (tsc_unstable)
-		goto unreg;
+	if (tsc_unstable) {
+		clocksource_unregister(&clocksource_tsc_early);
+		return 0;
+	}
 
 	if (boot_cpu_has(X86_FEATURE_NONSTOP_TSC_S3))
 		clocksource_tsc.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;
@@ -1421,9 +1446,10 @@ static int __init init_tsc_clocksource(void)
 		if (boot_cpu_has(X86_FEATURE_ART))
 			art_related_clocksource = &clocksource_tsc;
 		clocksource_register_khz(&clocksource_tsc, tsc_khz);
-unreg:
 		clocksource_unregister(&clocksource_tsc_early);
-		return 0;
+
+		if (!tsc_force_recalibrate)
+			return 0;
 	}
 
 	schedule_delayed_work(&tsc_irqwork, 0);
-- 
2.27.0


Thanks,
Feng

On Sun, May 08, 2022 at 10:47:33PM +0800, Feng Tang wrote:
> Currently when HW provides the tsc freq info through MSR or CPUID(0x15),
> the info will be taken as the 'best guess', and kernel will set the
> X86_FEATURE_TSC_KNOWN_FREQ flag and skip the HW timer based recalibration,
> which works pretty well.
> 
> And there is still very few corner case that the freq info is not
> accurate enough will small deviation from the actual value, like on
> a product with early version (fix needed) of firmware or some
> pre-production hardware.
> 
> Add an option 'recalibrate' for 'tsc' kernel parameter to force the
> tsc freq recalibration with HPET/PM_TIMER, and warn if the deviation
> from previous value is more than about 500 PPM.
> 
> Signed-off-by: Feng Tang <feng.tang@...el.com>
> ---
>  .../admin-guide/kernel-parameters.txt         |  5 +++
>  arch/x86/kernel/tsc.c                         | 35 ++++++++++++++++---
>  2 files changed, 36 insertions(+), 4 deletions(-)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ