Date:   Wed, 10 Nov 2021 20:30:10 -0500
From:   Waiman Long <longman@...hat.com>
To:     Feng Tang <feng.tang@...el.com>,
        "Paul E. McKenney" <paulmck@...nel.org>
Cc:     John Stultz <john.stultz@...aro.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Stephen Boyd <sboyd@...nel.org>, linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Cassio Neri <cassio.neri@...il.com>,
        Linus Walleij <linus.walleij@...aro.org>,
        Frederic Weisbecker <frederic@...nel.org>
Subject: Re: [PATCH 0/2] clocksource: Avoid incorrect hpet fallback


On 11/10/21 20:23, Feng Tang wrote:
> Hi Waiman, Paul,
>
> On Wed, Nov 10, 2021 at 05:17:30PM -0500, Waiman Long wrote:
>> It was found that when an x86 system was being stressed by running
>> various different benchmark suites, the clocksource watchdog might
>> occasionally mark TSC as unstable and fall back to hpet, which will
>> have a significant impact on system performance.
>   
> We've seen similar cases while running 'netperf' and the 'lockbus/ioport'
> stressors of the 'stress-ng' tool.
>
> In those scenarios, the clocksource used by the kernel is tsc, while
> hpet is used as the watchdog. When the skew happens, we found that
> it is mostly hpet's 'fault': when the system is under extreme
> pressure, a read of hpet can take a long time, and even 2
> consecutive reads of hpet can have a big gap (up to 1ms+) in between.
> So the skew we saw is actually caused by hpet instead of tsc, as
> a tsc read is a lightweight cpu operation.
>
> I tried the following patch to detect the skew of the watchdog itself,
> and avoid wrongly judging the tsc to be unstable. It does help in
> our tests, please help to review.
>
> And one further idea is to also add 2 consecutive reads of the current
> clocksource, compare their gap with the watchdog's, and skip the check
> if the watchdog's is bigger.

That is what I found too. I also added a 2nd watchdog read to compare
the delay between the two consecutive reads against half the threshold,
and skip the test if the delay exceeds it. My patch is actually similar in
concept to what your patch does.
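
As a rough illustration of the check described above, here is a userspace sketch, not the actual patch. It assumes all values are already in nanoseconds; the names `judge_interval` and `wd_verdict` are hypothetical, and judging skew as the absolute difference between the two clocks' intervals is an assumption about how the surrounding watchdog logic works.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical verdicts for one watchdog interval (not kernel identifiers). */
enum wd_verdict { WD_SKIP, WD_OK, WD_UNSTABLE };

/*
 * wd_delta_ns:        interval measured by the watchdog (e.g. hpet)
 * cs_delta_ns:        same interval measured by the clocksource (e.g. tsc)
 * wd_reread_delay_ns: delay between two consecutive watchdog reads
 * threshold_ns:       maximum tolerated skew (value chosen by the caller)
 */
static enum wd_verdict judge_interval(uint64_t wd_delta_ns, uint64_t cs_delta_ns,
                                      uint64_t wd_reread_delay_ns,
                                      uint64_t threshold_ns)
{
    uint64_t skew_ns;

    /* A slow watchdog re-read means the watchdog itself is unreliable now. */
    if (wd_reread_delay_ns > threshold_ns / 2)
        return WD_SKIP;

    /* Absolute difference between the two measurements of the interval. */
    skew_ns = cs_delta_ns > wd_delta_ns ? cs_delta_ns - wd_delta_ns
                                        : wd_delta_ns - cs_delta_ns;

    return skew_ns > threshold_ns ? WD_UNSTABLE : WD_OK;
}
```

The point of testing the re-read delay first is that a delayed watchdog read would otherwise show up as apparent clocksource skew and get tsc marked unstable by mistake.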

Cheers,
Longman
