Date:   Tue, 27 Apr 2021 23:03:22 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Feng Tang <feng.tang@...el.com>,
        "Paul E. McKenney" <paulmck@...nel.org>
Cc:     linux-kernel@...r.kernel.org, john.stultz@...aro.org,
        sboyd@...nel.org, corbet@....net, Mark.Rutland@....com,
        maz@...nel.org, kernel-team@...com, neeraju@...eaurora.org,
        ak@...ux.intel.com, zhengjun.xing@...el.com,
        Xing Zhengjun <zhengjun.xing@...ux.intel.com>
Subject: Re: [PATCH v10 clocksource 6/7] clocksource: Forgive tsc_early pre-calibration drift

On Mon, Apr 26 2021 at 23:01, Feng Tang wrote:
> On Sun, Apr 25, 2021 at 03:47:07PM -0700, Paul E. McKenney wrote:
> We've reported one case where the TSC can be wrongly judged as
> 'unstable' by the 'refined-jiffies' watchdog [1], and reducing the
> threshold could make this easier to trigger.
>
> It can be reproduced on a platform with a 115200 serial console and
> HPET disabled (several x86 platforms have this). With the
> 'initcall_debug' cmdline parameter added to get more debug messages,
> we can see:
>
> [    1.134197] clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc-early' as unstable because the skew is too large:
> [    1.134214] clocksource:                       'refined-jiffies' wd_nesc: 500000000 wd_now: ffff8b35 wd_last: ffff8b03 mask: ffffffff

refined-jiffies is the worst of all watchdogs and this obviously cannot
be fixed at all simply because we can lose ticks in that mode. And no,
we cannot compensate for lost ticks via TSC which we in turn "monitor"
via ticks.

Even if we hack around it and make it "work" then the TSC will never
become fully trusted because refined-jiffies cannot support NOHZ/HIGHRES
mode for obvious reasons either. So the system stays in periodic mode
forever.

If there is no independent timer to validate against, then the TSC had
better be stable, and the BIOS/SMM code has to be trusted not to wreck
the TSC. There are no other options.

So TBH, I do not care about this case at all. It's pointless to even
think about it. Either the TSC works on these systems or it doesn't. If
it doesn't, then you have to keep the pieces.

I'm so dead tired of this, especially since this has been known forever.
But it's obviously better to waste our time than to fix the damned
hardware.

Thanks,

        tglx
