linux-kernel - Re: [PATCH] clocksource: Use same threshold for read-back delay and time skew

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <cccef2a0-9267-47c1-871a-f3a1a9d9678d@redhat.com>
Date: Fri, 16 Jan 2026 15:05:52 -0500
From: Waiman Long <llong@...hat.com>
To: Jiri Wiesner <jwiesner@...e.de>,
 Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Cc: "Paul E. McKenney" <paulmck@...nel.org>, Feng Tang <feng.tang@...el.com>,
 John Stultz <jstultz@...gle.com>, Thomas Gleixner <tglx@...nel.org>,
 Stephen Boyd <sboyd@...nel.org>
Subject: Re: [PATCH] clocksource: Use same threshold for read-back delay and
 time skew

On 1/16/26 1:08 PM, Jiri Wiesner wrote:
> The idea behind using the uncertainty margins of clocksources to set the
> threshold for checking read-back delay is a sound one. For 3 clocksource
> readouts, 3 uncertainty margins are added up to calcute a read-back delay
> threshold. The read-back delay threshold:
> 2 * watchdog->uncertainty_margin + cs->uncertainty_margin
> is always larger than the threshold for evaluating time skew:
> watchdog->uncertainty_margin + cs->uncertainty_margin
>
> When an SMI or NMI is received between the first watchdog read and the
> clocksource read and servicing this interrupt takes more time than the
> threshold for evaluating time skew but also less time than read-back delay
> threshold, the readouts will pass the read-back delay threshold but the
> clocksource watchdog will mark the clocksource unstable on account of time
> skew. This decision is incorrect. Instead, another iteration of the loop
> in cs_watchdog_read() should execute the readouts again (should there be
> any iterations left).
>
> Correct the situation by decreasing the read-back delay threshold to a
> value equal to the time skew threshold. This patch resolves an issue
> observed on a production machine:
> clocksource: timekeeping watchdog on CPU556: Marking clocksource 'tsc' as unstable because the skew is too large:
> clocksource: 'hpet' wd_nsec: 479980680 wd_now: 1d86c0e1 wd_last: 1ccfa7c4 mask: ffffffff
> clocksource: 'tsc' cs_nsec: 480489826 cs_now: 9d18bdd2c71b5 cs_last: 9d18ba107dfe9 mask: ffffffffffffffff
> clocksource: Clocksource 'tsc' skewed 509146 ns (0 ms) over watchdog 'hpet' interval of 479980680 ns (479 ms)
> clocksource: 'tsc' is current clocksource.
> tsc: Marking TSC unstable due to clocksource watchdog
>
> Fixes: 4ac1dd3245b9 ("clocksource: Set cs_watchdog_read() checks based on .uncertainty_margin")
> Signed-off-by: Jiri Wiesner <jwiesner@...e.de>
> ---
>   kernel/time/clocksource.c | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index a1890a073196..31add3e40c12 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -252,7 +252,6 @@ enum wd_read_status {
>   
>   static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
>   {
> -	int64_t md = 2 * watchdog->uncertainty_margin;
>   	unsigned int nretries, max_retries;
>   	int64_t wd_delay, wd_seq_delay;
>   	u64 wd_end, wd_end2;
> @@ -267,7 +266,8 @@ static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow,
>   		local_irq_enable();
>   
>   		wd_delay = cycles_to_nsec_safe(watchdog, *wdnow, wd_end);
> -		if (wd_delay <= md + cs->uncertainty_margin) {
> +		if (wd_delay <= watchdog->uncertainty_margin
> +				+ cs->uncertainty_margin) {
>   			if (nretries > 1 && nretries >= max_retries) {
>   				pr_warn("timekeeping watchdog on CPU%d: %s retried %d times before success\n",
>   					smp_processor_id(), watchdog->name, nretries);
> @@ -280,12 +280,12 @@ static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow,
>   		 * there is too much external interferences that cause
>   		 * significant delay in reading both clocksource and watchdog.
>   		 *
> -		 * If consecutive WD read-back delay > md, report
> -		 * system busy, reinit the watchdog and skip the current
> -		 * watchdog test.
> +		 * If consecutive WD read-back delay is greater than
> +		 * 2 * watchdog->uncertainty_margin, report system busy,
> +		 * reinit the watchdog and skip the current watchdog test.
>   		 */
>   		wd_seq_delay = cycles_to_nsec_safe(watchdog, wd_end, wd_end2);
> -		if (wd_seq_delay > md)
> +		if (wd_seq_delay > 2 * watchdog->uncertainty_margin)
>   			goto skip_test;
>   	}
>   

I think wd_seq_delay should compare to just one 
watchdog->uncertainty_margin. See my comment on a related patch sent out 
by Thomas.

https://lore.kernel.org/lkml/8178bbf3-4a9d-4add-93bb-2ebd4dc03e9f@redhat.com/

Cheers,
Longman