Message-ID: <aIiZ2_DnJ3u6IINZ@incl>
Date: Tue, 29 Jul 2025 11:52:27 +0200
From: Jiri Wiesner <jwiesner@...e.de>
To: Dimitri Sivanich <sivanich@....com>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Jonathan Corbet <corbet@....net>, Steve Wahl <steve.wahl@....com>,
	Justin Ernst <justin.ernst@....com>,
	Kyle Meyer <kyle.meyer@....com>,
	Dimitri Sivanich <dimitri.sivanich@....com>,
	Russ Anderson <russ.anderson@....com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH] x86: UV RTC: Add parameter to disable RTC clocksource

I apologize for the lateness of my reply. I need to fix my mail filtering.

On Thu, Jul 17, 2025 at 03:50:11PM -0500, Dimitri Sivanich wrote:
> However, while the HPET may seem like a viable backup clocksource for purposes
> of watchdog checking, it won't scale when assigned as an actual clocksource.

Agreed.

> The UV RTC when used as an actual clocksource is more scalable than the HPET,
> but it does have higher access latency than the TSC. TSC provides the low
> access latency clocksource needed by many applications.

Agreed.

> HPE UV hardware is designed to have a reliable and synchronized TSC mechanism.  
> Comparing the TSC against these secondary clocksources can result in false
> positives due to variable access latency caused by system traffic.  The best
> course of action against these false positives has been found to simply disable
> watchdog checking of the TSC.  Currently we recommend that customers apply
> 'tsc=nowatchdog' to the kernel command line.

This is what we (SUSE) have instructed our customer to do in this case. But I thought we could do slightly better by disabling the UV RTC and allowing the checks to continue. The HPET is certainly not without flaws and we have reports of cases where the TSC was incorrectly marked unstable when the HPET was being used for verification. Recently, changes were merged that relaxed the thresholds substantially (our reports predate these changes):
> v6.8-rc5-23-g2ed08e4bc532 clocksource: Scale the watchdog read retries automatically
> v6.11-rc1-5-g4ac1dd3245b9 clocksource: Set cs_watchdog_read() checks based on .uncertainty_margin
The thresholds used to be 125,000 ns for the hpet-tsc-hpet read-back delay and 62,500 ns for the hpet-hpet read-back delay. Now they are 750,000 ns and 500,000 ns, respectively. The relaxed thresholds will make most of the false positives (TSC marked unstable) disappear. But I think fixed thresholds are not optimal. I would rather see the thresholds derived from previous measurements, e.g. the threshold value could be derived from the maximum hpet-hpet read-back delay that has been measured by the clocksource watchdog.
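
To illustrate what a measurement-derived threshold could look like, here is a minimal Python sketch. It is not kernel code and not a proposed patch: the class name, the headroom factor and the floor value are hypothetical and picked only for illustration. The only idea taken from the paragraph above is that the limit follows the maximum hpet-hpet read-back delay observed so far.

```python
class AdaptiveReadbackThreshold:
    """Derive a read-back delay limit from the largest delay measured so far."""

    def __init__(self, floor_ns=62_500, headroom=2.0):
        # floor_ns: hypothetical lower bound used before any sample exists.
        # headroom: hypothetical safety factor applied to the observed maximum.
        self.floor_ns = floor_ns
        self.headroom = headroom
        self.max_seen_ns = 0

    def record(self, delay_ns):
        """Feed one measured hpet-hpet read-back delay (in ns)."""
        if delay_ns > self.max_seen_ns:
            self.max_seen_ns = delay_ns

    def threshold_ns(self):
        """Current limit: scaled observed maximum, never below the floor."""
        return max(self.floor_ns, int(self.max_seen_ns * self.headroom))

    def is_acceptable(self, delay_ns):
        """True if a new read-back delay is within the current limit."""
        return delay_ns <= self.threshold_ns()


tracker = AdaptiveReadbackThreshold()
for sample_ns in (30_000, 100_000, 45_000):  # made-up sample delays
    tracker.record(sample_ns)

print(tracker.threshold_ns())          # 200000
print(tracker.is_acceptable(150_000))  # True
print(tracker.is_acceptable(250_000))  # False
```

The sketch leaves open which samples should feed the maximum (for example, whether a single extreme outlier should be allowed to raise the limit permanently). That policy question would need to be settled before anything like this could be considered for the watchdog.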

> commit b50db7095fe002fa3e16605546cba66bf1b68a3e
> Author: Feng Tang <feng.79.tang@...il.com>
> Date:   Wed Nov 17 10:37:51 2021 +0800
> 
>     x86/tsc: Disable clocksource watchdog for TSC on qualified platorms
> 
> commit 233756a640be811efae33763db718fe29753b1e9
> Author: Feng Tang <feng.79.tang@...il.com>
> Date:   Wed Jun 7 15:54:33 2023 +0800
> 
>     x86/tsc: Extend watchdog check exemption to 4-Sockets platform

These two patches have been backported to all SLES releases that include the tightened thresholds for clocksource watchdog checks:
> v5.13-rc4-23-gdb3a34e17433 clocksource: Retry clock read if long delays detected
> v5.13-rc4-26-g2e27e793e280 clocksource: Reduce clocksource-skew threshold
> v5.17-rc1-2-gfc153c1c58cb clocksource: Add a Kconfig option for WATCHDOG_MAX_SKEW
> v6.2-rc1-2-gc37e85c135ce clocksource: Loosen clocksource watchdog constraints

> commit b4bac279319d3082eb42f074799c7b18ba528c71
> Author: Feng Tang <feng.79.tang@...il.com>
> Date:   Mon Jul 29 10:12:02 2024 +0800
> 
>     x86/tsc: Use topology_max_packages() to get package number

We could not backport this patch because the older SLES releases do not contain T. Gleixner's patchset that made topology_max_packages() provide correct package count during early boot.

> Going forward, we will likely submit a patch that disables clocksource watchdog
> checking for newer UV systems in the kernel as well.

My impression is that the clocksource watchdog has mostly outlived its usefulness. I am aware of three occasions where the switch to the HPET caused by the clocksource watchdog notified customers of a serious issue on their system. The first occasion was a hardware issue involving the CPU not executing instructions for hundreds of microseconds but the counters reflecting the passage of time were still incrementing (as if the CPU "stuttered"). The other two occasions are still under investigation.

If I understand correctly, the point is that it would be more valuable to have the UV RTC available as a clocksource and avoid it being used by the clocksource watchdog for verifying the TSC. If the UV RTC were used as a clocksource, its time skew might become problematic. The largest time skew observed on the 8-socket UV machine was around 700 microseconds per 0.5 second, which is beyond what NTP can correct:
> clocksource: timekeeping watchdog on CPU118: Marking clocksource 'tsc' as unstable because the skew is too large:
> clocksource: 'sgi_rtc' wd_nsec: 511302794 wd_now: 1cb50e4c4b wd_last: 1ca7097111 mask: ffffffffffffff
> clocksource: 'hpet' wd2_nsec: 512005960 wd2_now: 65892719 wd2_last: 64c5d684 mask: ffffffff
> clocksource: 'tsc' cs_nsec: 512006458 cs_now: 86b5982cb1 cs_last: 867581bbab mask: ffffffffffffffff
> clocksource: Clocksource 'tsc' skewed 703664 ns (0 ms) over watchdog 'sgi_rtc' interval of 511302794 ns (511 ms)

> clocksource: timekeeping watchdog on CPU118: Marking clocksource 'tsc' as unstable because the skew is too large:
> clocksource: 'sgi_rtc' wd_nsec: 511302198 wd_now: 1b1cdebaa0 wd_last: 1b0ed9e078 mask: ffffffffffffff
> clocksource: 'tsc' cs_nsec: 512005312 cs_now: 7f746eabdd cs_last: 7f34584009 mask: ffffffffffffffff
> clocksource: Clocksource 'tsc' skewed 703114 ns (0 ms) over watchdog 'sgi_rtc' interval of 511302198 ns (511 ms)
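
The arithmetic behind these reports can be checked with a short Python sketch. The interval values are copied from the first log excerpt above. The helper names are hypothetical, and the 500 ppm figure is an assumption about the maximum frequency correction NTP can apply; it is not stated in the logs.

```python
NTP_MAX_PPM = 500  # assumed upper bound on NTP frequency correction


def skew_ns(cs_nsec, wd_nsec):
    """Difference between the clocksource and watchdog intervals, in ns."""
    return cs_nsec - wd_nsec


def skew_ppm(cs_nsec, wd_nsec):
    """The same difference, relative to the watchdog interval, in ppm."""
    return skew_ns(cs_nsec, wd_nsec) / wd_nsec * 1_000_000


# Interval lengths from the first report (cs_nsec, wd_nsec, wd2_nsec).
TSC_NSEC = 512006458
SGI_RTC_NSEC = 511302794
HPET_NSEC = 512005960

# TSC against the UV RTC: matches the "skewed 703664 ns" line in the log.
print(skew_ns(TSC_NSEC, SGI_RTC_NSEC))                  # 703664
print(skew_ppm(TSC_NSEC, SGI_RTC_NSEC) > NTP_MAX_PPM)   # True

# TSC against the HPET over the same period: they agree to within 498 ns.
print(skew_ns(TSC_NSEC, HPET_NSEC))                     # 498
```

Under these assumptions the sgi_rtc interval differs from the TSC interval by well over 1000 ppm, while the TSC and the HPET agree to within about 1 ppm. This is consistent with the UV RTC reading, not the TSC, being the outlier in this report.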

If the clocksource watchdog were disabled by default on newer UV systems, it would resolve the issue for us.
-- 
Jiri Wiesner
SUSE Labs
