[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cccef2a0-9267-47c1-871a-f3a1a9d9678d@redhat.com>
Date: Fri, 16 Jan 2026 15:05:52 -0500
From: Waiman Long <llong@...hat.com>
To: Jiri Wiesner <jwiesner@...e.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Cc: "Paul E. McKenney" <paulmck@...nel.org>, Feng Tang <feng.tang@...el.com>,
John Stultz <jstultz@...gle.com>, Thomas Gleixner <tglx@...nel.org>,
Stephen Boyd <sboyd@...nel.org>
Subject: Re: [PATCH] clocksource: Use same threshold for read-back delay and
time skew
On 1/16/26 1:08 PM, Jiri Wiesner wrote:
> The idea behind using the uncertainty margins of clocksources to set the
> threshold for checking read-back delay is a sound one. For 3 clocksource
> readouts, 3 uncertainty margins are added up to calcute a read-back delay
> threshold. The read-back delay threshold:
> 2 * watchdog->uncertainty_margin + cs->uncertainty_margin
> is always larger than the threshold for evaluating time skew:
> watchdog->uncertainty_margin + cs->uncertainty_margin
>
> When an SMI or NMI is received between the first watchdog read and the
> clocksource read and servicing this interrupt takes more time than the
> threshold for evaluating time skew but also less time than read-back delay
> threshold, the readouts will pass the read-back delay threshold but the
> clocksource watchdog will mark the clocksource unstable on account of time
> skew. This decision is incorrect. Instead, another iteration of the loop
> in cs_watchdog_read() should execute the readouts again (should there be
> any iterations left).
>
> Correct the situation by decreasing the read-back delay threshold to a
> value equal to the time skew threshold. This patch resolves an issue
> observed on a production machine:
> clocksource: timekeeping watchdog on CPU556: Marking clocksource 'tsc' as unstable because the skew is too large:
> clocksource: 'hpet' wd_nsec: 479980680 wd_now: 1d86c0e1 wd_last: 1ccfa7c4 mask: ffffffff
> clocksource: 'tsc' cs_nsec: 480489826 cs_now: 9d18bdd2c71b5 cs_last: 9d18ba107dfe9 mask: ffffffffffffffff
> clocksource: Clocksource 'tsc' skewed 509146 ns (0 ms) over watchdog 'hpet' interval of 479980680 ns (479 ms)
> clocksource: 'tsc' is current clocksource.
> tsc: Marking TSC unstable due to clocksource watchdog
>
> Fixes: 4ac1dd3245b9 ("clocksource: Set cs_watchdog_read() checks based on .uncertainty_margin")
> Signed-off-by: Jiri Wiesner <jwiesner@...e.de>
> ---
> kernel/time/clocksource.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index a1890a073196..31add3e40c12 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -252,7 +252,6 @@ enum wd_read_status {
>
> static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
> {
> - int64_t md = 2 * watchdog->uncertainty_margin;
> unsigned int nretries, max_retries;
> int64_t wd_delay, wd_seq_delay;
> u64 wd_end, wd_end2;
> @@ -267,7 +266,8 @@ static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow,
> local_irq_enable();
>
> wd_delay = cycles_to_nsec_safe(watchdog, *wdnow, wd_end);
> - if (wd_delay <= md + cs->uncertainty_margin) {
> + if (wd_delay <= watchdog->uncertainty_margin
> + + cs->uncertainty_margin) {
> if (nretries > 1 && nretries >= max_retries) {
> pr_warn("timekeeping watchdog on CPU%d: %s retried %d times before success\n",
> smp_processor_id(), watchdog->name, nretries);
> @@ -280,12 +280,12 @@ static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow,
> * there is too much external interferences that cause
> * significant delay in reading both clocksource and watchdog.
> *
> - * If consecutive WD read-back delay > md, report
> - * system busy, reinit the watchdog and skip the current
> - * watchdog test.
> + * If consecutive WD read-back delay is greater than
> + * 2 * watchdog->uncertainty_margin, report system busy,
> + * reinit the watchdog and skip the current watchdog test.
> */
> wd_seq_delay = cycles_to_nsec_safe(watchdog, wd_end, wd_end2);
> - if (wd_seq_delay > md)
> + if (wd_seq_delay > 2 * watchdog->uncertainty_margin)
> goto skip_test;
> }
>
I think wd_seq_delay should compare to just one
watchdog->uncertainty_margin. See my comment on a related patch sent out
by Thomas.
https://lore.kernel.org/lkml/8178bbf3-4a9d-4add-93bb-2ebd4dc03e9f@redhat.com/
Cheers,
Longman
Powered by blists - more mailing lists