Message-ID: <20231101111621.GC19106@noisy.programming.kicks-ass.net>
Date:   Wed, 1 Nov 2023 12:16:21 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     x86@...nel.org
Cc:     linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
        Arjan van de Ven <arjan@...ux.intel.com>,
        feng.tang@...el.com
Subject: [PATCH v2] x86/tsc: Have tsc=recalibrate override things

Subject: x86/tsc: Have tsc=recalibrate override things
From: Peter Zijlstra <peterz@...radead.org>
Date: Mon, 30 Oct 2023 17:00:50 +0100

My brand-spanking new SPR Supermicro workstation was reporting NTP
failures:

Oct 30 13:00:26 spr ntpd[3517]: CLOCK: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
Oct 30 13:00:58 spr ntpd[3517]: CLOCK: time stepped by 32.316775
Oct 30 13:00:58 spr ntpd[3517]: CLOCK: frequency error 41699 PPM exceeds tolerance 500 PPM

CPUID provides:

    Time Stamp Counter/Core Crystal Clock Information (0x15):
       TSC/clock ratio = 200/2
       nominal core crystal clock = 25000000 Hz
    Processor Frequency Information (0x16):
       Core Base Frequency (MHz) = 0x9c4 (2500)
       Core Maximum Frequency (MHz) = 0x12c0 (4800)
       Bus (Reference) Frequency (MHz) = 0x64 (100)

and the kernel believes this, taking 25 MHz * 200/2 = 2500 MHz as the
TSC frequency. Since commit a7ec817d5542 ("x86/tsc: Add option to force
frequency recalibration with HW timer") there is the tsc=recalibrate
option, which forces a recalibration.

This duly reports:

Oct 30 12:42:39 spr kernel: tsc: Warning: TSC freq calibrated by CPUID/MSR differs from what is calibrated by HW timer, please check with vendor!!
Oct 30 12:42:39 spr kernel: tsc: Previous calibrated TSC freq:         2500.000 MHz
Oct 30 12:42:39 spr kernel: tsc: TSC freq recalibrated by [HPET]:         2399.967 MHz

but the kernel then continues running at 2500 MHz, offering no solace
and keeping NTP upset -- the clock drifts ~30 seconds every 15 minutes.

Have tsc=recalibrate override the CPUID-provided numbers. This makes the
machine usable and keeps NTP 'happy':

Oct 30 16:48:44 spr ntpd[2200]: CLOCK: time stepped by -0.768679

Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
---
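As a rough sanity check of the numbers above, the mismatch between the
CPUID-derived frequency and the HPET-measured one can be turned into a
PPM error and a drift per interval. This is only an illustrative
userspace sketch: the helper names freq_error_ppm() and drift_seconds()
are made up for this note and are not kernel functions, and the two
frequencies are copied from the kernel log lines quoted above.

```c
/* Frequencies in kHz, copied from the kernel log lines quoted above. */
#define CPUID_TSC_KHZ 2500000.0	/* "Previous calibrated TSC freq" */
#define HPET_TSC_KHZ  2399967.0	/* "TSC freq recalibrated by [HPET]" */

/*
 * Hypothetical helper: relative error, in parts per million, of
 * assuming assumed_khz when the counter really ticks at actual_khz.
 */
static double freq_error_ppm(double assumed_khz, double actual_khz)
{
	return (assumed_khz - actual_khz) / actual_khz * 1e6;
}

/*
 * Hypothetical helper: seconds of drift accumulated over interval_s
 * seconds for a given frequency error in PPM.
 */
static double drift_seconds(double ppm, double interval_s)
{
	return ppm * 1e-6 * interval_s;
}
```

Calling freq_error_ppm(CPUID_TSC_KHZ, HPET_TSC_KHZ) gives an error of a
little over 4% (between 41000 and 42000 PPM), in line with the 41699 PPM
ntpd complained about, and drift_seconds() of that value over 900
seconds comes out at a few tens of seconds, the same order as the drift
described in the changelog.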
 arch/x86/kernel/tsc.c |   15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1430,14 +1430,13 @@ static void tsc_refine_calibration_work(
 			hpet ? "HPET" : "PM_TIMER",
 			(unsigned long)freq / 1000,
 			(unsigned long)freq % 1000);
+	} else {
 
-		return;
+		/* Make sure we're within 1% */
+		if (abs(tsc_khz - freq) > tsc_khz/100)
+			goto out;
 	}
 
-	/* Make sure we're within 1% */
-	if (abs(tsc_khz - freq) > tsc_khz/100)
-		goto out;
-
 	tsc_khz = freq;
 	pr_info("Refined TSC clocksource calibration: %lu.%03lu MHz\n",
 		(unsigned long)tsc_khz / 1000,
@@ -1479,14 +1478,12 @@ static int __init init_tsc_clocksource(v
 	 * When TSC frequency is known (retrieved via MSR or CPUID), we skip
 	 * the refined calibration and directly register it as a clocksource.
 	 */
-	if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ)) {
+	if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ) && !tsc_force_recalibrate) {
 		if (boot_cpu_has(X86_FEATURE_ART))
 			art_related_clocksource = &clocksource_tsc;
 		clocksource_register_khz(&clocksource_tsc, tsc_khz);
 		clocksource_unregister(&clocksource_tsc_early);
-
-		if (!tsc_force_recalibrate)
-			return 0;
+		return 0;
 	}
 
 	schedule_delayed_work(&tsc_irqwork, 0);
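
The frequency selection that results from the first hunk can be modelled
in userspace roughly as below. This is a sketch, not kernel code: the
function name refined_tsc_khz() and the "forced" parameter are invented
for illustration. It assumes that the branch closed by the new
"} else {" is the path taken when recalibration was forced on a CPU with
a known TSC frequency, and that "goto out" leaves tsc_khz unchanged.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical model of which frequency ends up in tsc_khz.
 *
 * tsc_khz: frequency currently in use (e.g. derived from CPUID/MSR)
 * freq:    frequency measured against the HW timer (HPET or PM timer)
 * forced:  assumed to mean "tsc=recalibrate was given and the CPU
 *          advertises a known TSC frequency"
 */
static unsigned long refined_tsc_khz(unsigned long tsc_khz,
				     unsigned long freq, bool forced)
{
	if (!forced) {
		/* Make sure we're within 1% */
		if (labs((long)tsc_khz - (long)freq) > (long)(tsc_khz / 100))
			return tsc_khz;	/* models "goto out": keep old value */
	}

	/* Forced recalibration skips the 1% check and takes the new value. */
	return freq;
}
```

With the values from the changelog, refined_tsc_khz(2500000, 2399967,
true) yields 2399967, while the same call with forced set to false keeps
2500000 because the two values differ by more than 1%.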
