Message-ID: <c9ce2eb1-bf90-3ce4-0adf-3f4e43f4a5bd@igalia.com>
Date: Sun, 23 Mar 2025 14:53:05 -0300
From: "Guilherme G. Piccoli" <gpiccoli@...lia.com>
To: Thomas Gleixner <tglx@...utronix.de>, "H. Peter Anvin" <hpa@...or.com>,
 bp@...en8.de
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, mingo@...hat.com,
 dave.hansen@...ux.intel.com, kernel@...ccoli.net, kernel-dev@...lia.com
Subject: Re: [PATCH] x86/tsc: Add debugfs entry to mark TSC as unstable after
 boot

Thanks Thomas for your comprehensive response, quite enriching.
Some comments inline:


On 21/03/2025 18:19, Thomas Gleixner wrote:
> [...]
> The proposed implementation is just an ad hoc band aid as well. Why?
> 
>   1) It has zero relation to the actual failure detection code paths.
> 
>   2) It covers only a small part of the problem space. On all modern
>      systems, which have TSC_ADJUST the clocksource watchdog is disabled
>      and just asynchronously invoking TSC unstable is a hack which only
>      tests the unstable logic.

But what about AMD systems? Even the modern ones apparently lack
TSC_ADJUST, or has that changed recently?

Checking the TSC code, it is full of "if Intel" checks as well, for
example in native calibration. Our issue shows up on AMD, and my
impression is that, in this respect, these systems are considerably less
stable (from the TSC perspective) than the ones that have TSC_ADJUST.
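
For reference, a minimal userspace sketch of how one could check whether
a given machine advertises TSC_ADJUST, assuming the kernel exposes the
feature as the "tsc_adjust" token on the "flags" line of /proc/cpuinfo.
The helper names has_cpu_flag() and cpu_has_tsc_adjust() are made up for
this illustration and are not taken from the kernel or from the patch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true if `flag` appears as a whole whitespace-separated token
 * in `flags`, so that "tsc" does not match "constant_tsc".
 */
bool has_cpu_flag(const char *flags, const char *flag)
{
	size_t len = strlen(flag);
	const char *p = flags;

	if (len == 0)
		return false;

	while ((p = strstr(p, flag)) != NULL) {
		bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
		bool ends = p[len] == '\0' || p[len] == ' ' ||
			    p[len] == '\t' || p[len] == '\n';

		if (starts && ends)
			return true;
		p += len;
	}
	return false;
}

/*
 * Hypothetical helper: 1 if the first "flags" line in /proc/cpuinfo
 * lists tsc_adjust, 0 if it does not, -1 if no such line could be read.
 */
int cpu_has_tsc_adjust(void)
{
	char line[8192];
	int ret = -1;
	FILE *f = fopen("/proc/cpuinfo", "r");

	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		if (strncmp(line, "flags", 5) == 0) {
			ret = has_cpu_flag(line, "tsc_adjust") ? 1 : 0;
			break;
		}
	}
	fclose(f);
	return ret;
}
```

Running cpu_has_tsc_adjust() on an Intel box and on an AMD box would be
a quick way to compare the two cases discussed here.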


> 
> So I rather want to see a more complete solution, which
> 
>   1) lets the clocksource watchdog logic fail the test
> 
>   2) lets the TSC sync (including TSC_ADJUST) logic on CPU hotplug fail
> 
>   3) tweaks the TSC_ADJUST register and validates that the detection and
>      mitigation logic on systems w/o clocksource watchdog works
>      correctly.
> 
> Ideally that's a kunit test for CI integration plus a debugfs interface
> for developers, which comes with a related selftest.
> 

This is a great suggestion. I'll try to come up with something in the
next few weeks (as time allows); I agree this area does seem to lack
good/easy testing.
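
As a rough idea of what the userspace half of such a selftest could
check, here is a small sketch that reads the current clocksource from
sysfs and reports whether it is still "tsc". It assumes the usual
/sys/devices/system/clocksource/clocksource0/current_clocksource file is
present; the function names are hypothetical, and this is not the
debugfs interface from the patch nor the kunit test suggested above.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define CUR_CLOCKSOURCE \
	"/sys/devices/system/clocksource/clocksource0/current_clocksource"

/* Compare a sysfs value (possibly newline-terminated) against "tsc". */
bool clocksource_is_tsc(const char *value)
{
	size_t len = strcspn(value, "\n");

	return len == 3 && strncmp(value, "tsc", 3) == 0;
}

/*
 * Hypothetical helper: 1 if the current clocksource is tsc, 0 if it is
 * something else, -1 if the sysfs file could not be read.
 */
int current_clocksource_is_tsc(void)
{
	char buf[64];
	FILE *f = fopen(CUR_CLOCKSOURCE, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return clocksource_is_tsc(buf) ? 1 : 0;
}
```

A test could call current_clocksource_is_tsc() before and after the TSC
is marked unstable and expect the result to go from 1 to 0.
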
Cheers,


Guilherme
