lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87tu40p3ws.ffs@tglx>
Date:   Wed, 19 Oct 2022 11:18:43 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Feng Tang <feng.tang@...el.com>, Ingo Molnar <mingo@...hat.com>,
        Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...el.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Cc:     rui.zhang@...el.com, tim.c.chen@...el.com,
        Xiongfeng Wang <wangxiongfeng2@...wei.com>,
        Feng Tang <feng.tang@...el.com>, Yu Liao <liaoyu15@...wei.com>
Subject: Re: [PATCH v2] x86/tsc: Extend watchdog check exemption to
 4-Sockets platform

On Thu, Oct 13 2022 at 21:12, Feng Tang wrote:
> There is report again that the tsc clocksource on a 4 sockets x86
> Skylake server was wrongly judged as 'unstable' by 'jiffies' watchdog,
> and disabled [1].
>
> Commit b50db7095fe0 ("x86/tsc: Disable clocksource watchdog for TSC
> on qualified platorms") was introduce to deal with these false
> alarms of tsc unstable issues, covering qualified platforms for 2
> sockets or smaller ones.
>
> Extend the exemption to 4 sockets to fix the issue.
>
> We also got similar reports on 8 sockets platform from internal test,
> but as Peter pointed out, there was tsc sync issues for 8-sockets
> platform, and it'd better be handled architecture by architecture,
> instead of directly changing the threshold to 8 here.
>
> Rui also proposed another way to disable 'jiffies' as clocksource
> watchdog [2], which can also solve this specific problem in an
> architecture independent way, with one limitation that some tsc false
> alarms are reported by other watchdogs like HPET in post-boot time,
> while 'jiffies' is mostly used in boot phase before hardware
> clocksources are initialized.

HPET is initialized early, but if HPET is disabled or not advertised
then the only other hardware clocksource is PMTIMER which is initialized
late via fs_initcall. PMTIMER is initialized late due to broken Pentium
era chipsets which are sorted with PCI quirks. For anything else we can
initialize it early. Something like the below.

I'm sure I said this more than once, but I'm happy to repeat myself
forever:

  Instead of proliferating lousy hacks, can the X86 vendors finaly get
  their act together and provide some architected information whether
  the TSC is trustworthy or not?

Thanks,

        tglx
---

--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -10,6 +10,7 @@
  *
  */
 
+#include <linux/acpi_pmtmr.h>
 #include <linux/clocksource.h>
 #include <linux/clockchips.h>
 #include <linux/interrupt.h>
@@ -75,6 +76,14 @@ static void __init setup_default_timer_i
 void __init hpet_time_init(void)
 {
 	if (!hpet_enable()) {
+		/*
+		 * Some Pentium chipsets have broken HPETs and need
+		 * PCI quirks to run before init.
+		 */
+		if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
+		    boot_cpu_data.family != 5)
+			init_acpi_pm_clocksource();
+
 		if (!pit_timer_init())
 			return;
 	}
--- a/drivers/clocksource/acpi_pm.c
+++ b/drivers/clocksource/acpi_pm.c
@@ -30,6 +30,7 @@
  * in arch/i386/kernel/acpi/boot.c
  */
 u32 pmtmr_ioport __read_mostly;
+static bool pmtmr_initialized __init_data;
 
 static inline u32 read_pmtmr(void)
 {
@@ -142,7 +143,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SE
  * Some boards have the PMTMR running way too fast. We check
  * the PMTMR rate against PIT channel 2 to catch these cases.
  */
-static int verify_pmtmr_rate(void)
+static int __init verify_pmtmr_rate(void)
 {
 	u64 value1, value2;
 	unsigned long count, delta;
@@ -172,14 +173,18 @@ static int verify_pmtmr_rate(void)
 /* Number of reads we try to get two different values */
 #define ACPI_PM_READ_CHECKS 10000
 
-static int __init init_acpi_pm_clocksource(void)
+int __init init_acpi_pm_clocksource(void)
 {
 	u64 value1, value2;
 	unsigned int i, j = 0;
+	int ret;
 
 	if (!pmtmr_ioport)
 		return -ENODEV;
 
+	if (pmtmr_initialized)
+		return 0;
+
 	/* "verify" this timing source: */
 	for (j = 0; j < ACPI_PM_MONOTONICITY_CHECKS; j++) {
 		udelay(100 * j);
@@ -210,10 +215,11 @@ static int __init init_acpi_pm_clocksour
 		return -ENODEV;
 	}
 
-	return clocksource_register_hz(&clocksource_acpi_pm,
-						PMTMR_TICKS_PER_SEC);
+	ret = clocksource_register_hz(&clocksource_acpi_pm, PMTMR_TICKS_PER_SEC);
+	if (!ret)
+		pmtimer_initialized = true;
+	return ret;
 }
-
 /* We use fs_initcall because we want the PCI fixups to have run
  * but we still need to load before device_initcall
  */
--- a/include/linux/acpi_pmtmr.h
+++ b/include/linux/acpi_pmtmr.h
@@ -13,6 +13,8 @@
 /* Overrun value */
 #define ACPI_PM_OVRRUN	(1<<24)
 
+extern int __init init_acpi_pm_clocksource(void);
+
 #ifdef CONFIG_X86_PM_TIMER
 
 extern u32 acpi_pm_read_verified(void);


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ