[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87iko213qo.ffs@tglx>
Date: Fri, 21 Mar 2025 22:19:11 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: "Guilherme G. Piccoli" <gpiccoli@...lia.com>, "H. Peter Anvin"
<hpa@...or.com>, bp@...en8.de
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, mingo@...hat.com,
dave.hansen@...ux.intel.com, kernel@...ccoli.net, kernel-dev@...lia.com
Subject: Re: [PATCH] x86/tsc: Add debugfs entry to mark TSC as unstable
after boot
On Fri, Mar 21 2025 at 16:26, Guilherme G. Piccoli wrote:
> On 17/03/2025 15:42, H. Peter Anvin wrote:
>> To be honest I don't think this belongs in debugfs; rather it belongs in sysfs.
No.
>> Debugfs should not be necessarily in serious production systems – it
>> is way too large of an attack surface, which is a very good reason
>> why it is its own filesystem – but if this is a real issue on
>> hardware then it may be needed.
There is ZERO reason to do that on a production system.
If the in kernel detection does not work, then switching it over after
someone detected the problem five hours after the fact does not help at
all.
The admin can force the TSC to be removed from timekeeping, which is the
really crucial part, already today by changing the clocksource via sysfs.
> In other words, we have 2 options in my understanding:
>
> (a) Drop it;
>
> (b) Re-implement using sysfs entry instead of debugfs;
Neither (a) nor (b) nor the proposed implementation.
There is actually a good reason why a debug/validation mechanism of some
sort makes sense, i.e. testing:
1) The hardware, which exposed these issues frequently is starting to
get into museum or junkyard state, which reduces the test base
significantly.
2) Modern hardware, which exposes this issue in large fleets
occasionally due to aging and misdirected neutrons, is not really a
good testbed either.
So we have no real test coverage for something, which can be crucial in
the actual failure case.
Sure, it could be argued that this can be implemented in qemu, but that's
fundamentally the wrong approach.
This is something which must be easily available to developers and CI
and not require to have a special setup with debug nonsense enabled in
some external tool.
The proposed implementation is just an ad hoc band aid as well. Why?
1) It has zero relation to the actual failure detection code paths.
2) It covers only a small part of the problem space. On all modern
systems, which have TSC_ADJUST the clocksource watchdog is disabled
and just asynchronously invoking TSC unstable is a hack which only
tests the unstable logic.
So I rather want to see a more complete solution, which
1) lets the clocksource watchdog logic fail the test
2) lets the TSC sync (including TSC_ADJUST) logic on CPU hotplug fail
3) tweaks the TSC_ADJUST register and validates that the detection and
mitigation logic on systems w/o clocksource watchdog works
correctly.
Ideally that's a kunit test for CI integration plus a debugfs interface
for developers, which comes with a related selftest.
Thanks,
tglx
Powered by blists - more mailing lists