[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20221019230904.GA2502730@paulmck-ThinkPad-P17-Gen-1>
Date: Wed, 19 Oct 2022 16:09:04 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: linux-kernel@...r.kernel.org
Cc: clm@...a.com, jstultz@...gle.com, tglx@...utronix.de,
sboyd@...nel.org, feng.tang@...el.com, longman@...hat.com
Subject: [PATCH clocksource] Reject bogus watchdog clocksource measurements
One remaining clocksource-skew issue involves extreme CPU overcommit,
which can cause the clocksource watchdog measurements to be delayed by
tens of seconds. This in turn means that a clock-skew criterion that
is appropriate for a 500-millisecond interval will instead give lots of
false positives.
Therefore, check for the watchdog clocksource reporting much larger or
much less than the time specified by WATCHDOG_INTERVAL. In these cases,
print a pr_warn() warning and refrain from marking the clocksource under
test as being unstable.
Reported-by: Chris Mason <clm@...a.com>
Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
Cc: John Stultz <jstultz@...gle.com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Stephen Boyd <sboyd@...nel.org>
Cc: Feng Tang <feng.tang@...el.com>
Cc: Waiman Long <longman@...hat.com>
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 8058bec87acee..dcaf38c062161 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -386,7 +386,7 @@ EXPORT_SYMBOL_GPL(clocksource_verify_percpu);
static void clocksource_watchdog(struct timer_list *unused)
{
- u64 csnow, wdnow, cslast, wdlast, delta;
+ u64 csnow, wdnow, cslast, wdlast, delta, wdi;
int next_cpu, reset_pending;
int64_t wd_nsec, cs_nsec;
struct clocksource *cs;
@@ -440,6 +440,17 @@ static void clocksource_watchdog(struct timer_list *unused)
if (atomic_read(&watchdog_reset_pending))
continue;
+ /* Check for bogus measurements. */
+ wdi = jiffies_to_nsecs(WATCHDOG_INTERVAL);
+ if (wd_nsec < (wdi >> 2)) {
+ pr_warn("timekeeping watchdog on CPU%d: Watchdog clocksource '%s' advanced only %lld ns during %d-jiffy time interval, skipping watchdog check.\n", smp_processor_id(), watchdog->name, wd_nsec, WATCHDOG_INTERVAL);
+ continue;
+ }
+ if (wd_nsec > (wdi << 2)) {
+ pr_warn("timekeeping watchdog on CPU%d: Watchdog clocksource '%s' advanced an excessive %lld ns during %d-jiffy time interval, probable CPU overutilization, skipping watchdog check.\n", smp_processor_id(), watchdog->name, wd_nsec, WATCHDOG_INTERVAL);
+ continue;
+ }
+
/* Check the deviation from the watchdog clocksource. */
md = cs->uncertainty_margin + watchdog->uncertainty_margin;
if (abs(cs_nsec - wd_nsec) > md) {
Powered by blists - more mailing lists