lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <e16267c8edefe2e53bea1de510330e70c06d245a.1438795461.git.shli@fb.com>
Date:	Wed, 5 Aug 2015 11:12:53 -0700
From:	Shaohua Li <shli@...com>
To:	<linux-kernel@...r.kernel.org>
CC:	<Kernel-team@...com>, John Stultz <john.stultz@...aro.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...nel.org>
Subject: [PATCH 1/2] clocksource: improve unstable clocksource detection

>From time to time we saw TSC is marked as unstable in our systems, while
the CPUs declare to have stable TSC. Looking at the clocksource unstable
detection, there are two problems:
- watchdog clock source wrap. HPET is the most common watchdog clock
  source. It's 32-bit and runs in 14.3Mhz. That means the hpet counter
  can wrap in about 5 minutes.
- threshold isn't scaled against interval. The threshold is 0.0625s in
  0.5s interval. What if the actual interval is bigger than 0.5s?

The watchdog runs in a timer bh, so hard/soft irq can defer its running.
Heavy network stack softirq can hog a cpu. IPMI driver can disable
interrupt for a very long time. The first problem is mostly we are
suffering I think.

Here is a simple patch to fix the issues. If the waterdog doesn't run
for a long time, we ignore the detection. This should work for the two
problems. For the second one, we probably doen't need to scale if the
interval isn't very long.

Cc: John Stultz <john.stultz@...aro.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Ingo Molnar <mingo@...nel.org>
Signed-off-by: Shaohua Li <shli@...com>
---
 kernel/time/clocksource.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 841b72f..8417c83 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -122,9 +122,10 @@ static int clocksource_watchdog_kthread(void *data);
 static void __clocksource_change_rating(struct clocksource *cs, int rating);
 
 /*
- * Interval: 0.5sec Threshold: 0.0625s
+ * Interval: 0.5sec MaxInterval: 1s Threshold: 0.0625s
  */
 #define WATCHDOG_INTERVAL (HZ >> 1)
+#define WATCHDOG_MAX_INTERVAL_NS (NSEC_PER_SEC)
 #define WATCHDOG_THRESHOLD (NSEC_PER_SEC >> 4)
 
 static void clocksource_watchdog_work(struct work_struct *work)
@@ -217,7 +218,9 @@ static void clocksource_watchdog(unsigned long data)
 			continue;
 
 		/* Check the deviation from the watchdog clocksource. */
-		if ((abs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD)) {
+		if ((abs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD) &&
+		    cs_nsec < WATCHDOG_MAX_INTERVAL_NS &&
+		    wd_nsec < WATCHDOG_MAX_INTERVAL_NS) {
 			pr_warn("timekeeping watchdog: Marking clocksource '%s' as unstable because the skew is too large:\n",
 				cs->name);
 			pr_warn("                      '%s' wd_now: %llx wd_last: %llx mask: %llx\n",
-- 
1.8.5.6

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ