lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <aWp-kKp6PkBsRxMY@incl>
Date: Fri, 16 Jan 2026 19:08:16 +0100
From: Jiri Wiesner <jwiesner@...e.de>
To: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Cc: "Paul E. McKenney" <paulmck@...nel.org>,
	Feng Tang <feng.tang@...el.com>, Waiman Long <longman@...hat.com>,
	John Stultz <jstultz@...gle.com>, Thomas Gleixner <tglx@...nel.org>,
	Stephen Boyd <sboyd@...nel.org>
Subject: [PATCH] clocksource: Use same threshold for read-back delay and time
 skew

The idea behind using the uncertainty margins of clocksources to set the
threshold for checking read-back delay is a sound one. For 3 clocksource
readouts, 3 uncertainty margins are added up to calcute a read-back delay
threshold. The read-back delay threshold:
2 * watchdog->uncertainty_margin + cs->uncertainty_margin
is always larger than the threshold for evaluating time skew:
watchdog->uncertainty_margin + cs->uncertainty_margin

When an SMI or NMI is received between the first watchdog read and the
clocksource read and servicing this interrupt takes more time than the
threshold for evaluating time skew but also less time than read-back delay
threshold, the readouts will pass the read-back delay threshold but the
clocksource watchdog will mark the clocksource unstable on account of time
skew. This decision is incorrect. Instead, another iteration of the loop
in cs_watchdog_read() should execute the readouts again (should there be
any iterations left).

Correct the situation by decreasing the read-back delay threshold to a
value equal to the time skew threshold. This patch resolves an issue
observed on a production machine:
clocksource: timekeeping watchdog on CPU556: Marking clocksource 'tsc' as unstable because the skew is too large:
clocksource: 'hpet' wd_nsec: 479980680 wd_now: 1d86c0e1 wd_last: 1ccfa7c4 mask: ffffffff
clocksource: 'tsc' cs_nsec: 480489826 cs_now: 9d18bdd2c71b5 cs_last: 9d18ba107dfe9 mask: ffffffffffffffff
clocksource: Clocksource 'tsc' skewed 509146 ns (0 ms) over watchdog 'hpet' interval of 479980680 ns (479 ms)
clocksource: 'tsc' is current clocksource.
tsc: Marking TSC unstable due to clocksource watchdog

Fixes: 4ac1dd3245b9 ("clocksource: Set cs_watchdog_read() checks based on .uncertainty_margin")
Signed-off-by: Jiri Wiesner <jwiesner@...e.de>
---
 kernel/time/clocksource.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index a1890a073196..31add3e40c12 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -252,7 +252,6 @@ enum wd_read_status {
 
 static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
 {
-	int64_t md = 2 * watchdog->uncertainty_margin;
 	unsigned int nretries, max_retries;
 	int64_t wd_delay, wd_seq_delay;
 	u64 wd_end, wd_end2;
@@ -267,7 +266,8 @@ static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow,
 		local_irq_enable();
 
 		wd_delay = cycles_to_nsec_safe(watchdog, *wdnow, wd_end);
-		if (wd_delay <= md + cs->uncertainty_margin) {
+		if (wd_delay <= watchdog->uncertainty_margin
+				+ cs->uncertainty_margin) {
 			if (nretries > 1 && nretries >= max_retries) {
 				pr_warn("timekeeping watchdog on CPU%d: %s retried %d times before success\n",
 					smp_processor_id(), watchdog->name, nretries);
@@ -280,12 +280,12 @@ static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow,
 		 * there is too much external interferences that cause
 		 * significant delay in reading both clocksource and watchdog.
 		 *
-		 * If consecutive WD read-back delay > md, report
-		 * system busy, reinit the watchdog and skip the current
-		 * watchdog test.
+		 * If consecutive WD read-back delay is greater than
+		 * 2 * watchdog->uncertainty_margin, report system busy,
+		 * reinit the watchdog and skip the current watchdog test.
 		 */
 		wd_seq_delay = cycles_to_nsec_safe(watchdog, wd_end, wd_end2);
-		if (wd_seq_delay > md)
+		if (wd_seq_delay > 2 * watchdog->uncertainty_margin)
 			goto skip_test;
 	}
 
-- 
2.51.0


-- 
Jiri Wiesner
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ