lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 14 Nov 2022 15:28:24 -0800
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     tglx@...utronix.de
Cc:     linux-kernel@...r.kernel.org, john.stultz@...aro.org,
        sboyd@...nel.org, corbet@....net, Mark.Rutland@....com,
        maz@...nel.org, kernel-team@...a.com, neeraju@...eaurora.org,
        ak@...ux.intel.com, feng.tang@...el.com, zhengjun.xing@...el.com,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Chris Mason <clm@...a.com>, John Stultz <jstultz@...gle.com>,
        Waiman Long <longman@...hat.com>
Subject: [PATCH clocksource 1/3] clocksource: Reject bogus watchdog clocksource measurements

One remaining clocksource-skew issue involves extreme CPU overcommit,
which can cause the clocksource watchdog measurements to be delayed by
tens of seconds.  This in turn means that a clock-skew criterion that
is appropriate for a 500-millisecond interval will instead give lots of
false positives.

Therefore, check for the watchdog clocksource reporting much larger or
much less than the time specified by WATCHDOG_INTERVAL.  In these cases,
print a pr_warn() warning and refrain from marking the clocksource under
test as being unstable.

Reported-by: Chris Mason <clm@...a.com>
Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
Cc: John Stultz <jstultz@...gle.com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Stephen Boyd <sboyd@...nel.org>
Cc: Feng Tang <feng.tang@...el.com>
Cc: Waiman Long <longman@...hat.com>
---
 kernel/time/clocksource.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 8058bec87acee..dcaf38c062161 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -386,7 +386,7 @@ EXPORT_SYMBOL_GPL(clocksource_verify_percpu);
 
 static void clocksource_watchdog(struct timer_list *unused)
 {
-	u64 csnow, wdnow, cslast, wdlast, delta;
+	u64 csnow, wdnow, cslast, wdlast, delta, wdi;
 	int next_cpu, reset_pending;
 	int64_t wd_nsec, cs_nsec;
 	struct clocksource *cs;
@@ -440,6 +440,17 @@ static void clocksource_watchdog(struct timer_list *unused)
 		if (atomic_read(&watchdog_reset_pending))
 			continue;
 
+		/* Check for bogus measurements. */
+		wdi = jiffies_to_nsecs(WATCHDOG_INTERVAL);
+		if (wd_nsec < (wdi >> 2)) {
+			pr_warn("timekeeping watchdog on CPU%d: Watchdog clocksource '%s' advanced only %lld ns during %d-jiffy time interval, skipping watchdog check.\n", smp_processor_id(), watchdog->name, wd_nsec, WATCHDOG_INTERVAL);
+			continue;
+		}
+		if (wd_nsec > (wdi << 2)) {
+			pr_warn("timekeeping watchdog on CPU%d: Watchdog clocksource '%s' advanced an excessive %lld ns during %d-jiffy time interval, probable CPU overutilization, skipping watchdog check.\n", smp_processor_id(), watchdog->name, wd_nsec, WATCHDOG_INTERVAL);
+			continue;
+		}
+
 		/* Check the deviation from the watchdog clocksource. */
 		md = cs->uncertainty_margin + watchdog->uncertainty_margin;
 		if (abs(cs_nsec - wd_nsec) > md) {
-- 
2.31.1.189.g2e36527f23

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ