Message-ID: <20210806041503.GO4397@paulmck-ThinkPad-P17-Gen-1>
Date:   Thu, 5 Aug 2021 21:15:03 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Chao Gao <chao.gao@...el.com>
Cc:     Feng Tang <feng.tang@...el.com>,
        kernel test robot <oliver.sang@...el.com>,
        John Stultz <john.stultz@...aro.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Stephen Boyd <sboyd@...nel.org>,
        Jonathan Corbet <corbet@....net>,
        Mark Rutland <Mark.Rutland@....com>,
        Marc Zyngier <maz@...nel.org>, Andi Kleen <ak@...ux.intel.com>,
        Xing Zhengjun <zhengjun.xing@...ux.intel.com>,
        Chris Mason <clm@...com>, LKML <linux-kernel@...r.kernel.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
        zhengjun.xing@...el.com
Subject: Re: [clocksource]  8901ecc231:  stress-ng.lockbus.ops_per_sec -9.5%
 regression

On Fri, Aug 06, 2021 at 10:10:00AM +0800, Chao Gao wrote:
> On Thu, Aug 05, 2021 at 08:37:27AM -0700, Paul E. McKenney wrote:
> >On Thu, Aug 05, 2021 at 01:39:40PM +0800, Chao Gao wrote:
> >> [snip]
> >> >> This patch works well; no false-positive (marking TSC unstable) in a
> >> >> 10hr stress test.
> >> >
> >> >Very good, thank you!  May I add your Tested-by?
> >> 
> >> sure.
> >> Tested-by: Chao Gao <chao.gao@...el.com>
> >
> >Very good, thank you!  I will apply this on the next rebase.
> >
> >> >I expect that I will need to modify the patch a bit more to check for
> >> >a system where it is -never- able to get a good fine-grained read from
> >> >the clock.
> >> 
> >> Agreed.
> >> 
> >> >And it might be that your test run ended up in that state.
> >> 
> >> That is not the case, judging from the kernel logs. The coarse-grained check
> >> happened 6475 times in 43k seconds (grepping for "coarse-grained skew check"
> >> in the kernel logs). So most of the checks were still fine-grained.
> >
> >Whew!  ;-)
> >
> >So about once per 13 clocksource watchdog checks.
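
(For the record, the "once per 13" assumes the default half-second watchdog
interval, WATCHDOG_INTERVAL = HZ >> 1:  43,000 s / 0.5 s is about 86,000
checks, and 86,000 / 6475 is roughly 13.)
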
> >
> >To Andi's point, do you have enough information in your console log to
> >work out the longest run of coarse-grained clocksource checks?
> 
> Yes: 5 consecutive coarse-grained clocksource checks. Note that,
> given the reinitialization after a coarse-grained check, my
> calculation treats two coarse-grained checks as consecutive if
> they happen within 1s (+/- 0.3s) of each other.

Very good, thank you!
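
In case it helps to automate that count, below is a quick and dirty
userspace sketch of the same rule.  It assumes "[seconds]" printk
timestamps on the "coarse-grained skew check" lines that your grep
found; it is just an illustration, not anything in the tree.

#include <stdio.h>
#include <string.h>

/*
 * Count the longest run of consecutive coarse-grained checks in a
 * dmesg log, treating two checks as consecutive when their timestamps
 * are 1s (+/- 0.3s) apart.  Usage:  ./longest-run < dmesg.txt
 */
int main(void)
{
	char line[1024];
	double prev = -1.0, t;
	int run = 1, longest = 0;

	while (fgets(line, sizeof(line), stdin)) {
		if (!strstr(line, "coarse-grained skew check"))
			continue;
		if (sscanf(line, "[%lf]", &t) != 1)
			continue;
		if (prev >= 0.0 && t - prev >= 0.7 && t - prev <= 1.3)
			run++;		/* within 1s +/- 0.3s: same run */
		else
			run = 1;	/* otherwise start a new run */
		if (run > longest)
			longest = run;
		prev = t;
	}
	printf("Longest run of coarse-grained checks: %d\n", longest);
	return 0;
}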

So it seems eminently reasonable to have the clocksource watchdog complain
bitterly after more than (say) 100 consecutive coarse-grained checks.
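
Very roughly, I am thinking of something like the following, shown here
as a userspace mock-up so it can be compiled and poked at.  All of the
names are invented for this sketch; the real counter would of course
live in the clocksource watchdog itself, and 100 is just a placeholder.

#include <stdbool.h>
#include <stdio.h>

/* Placeholder threshold for "too many" consecutive coarse-grained checks. */
#define MAX_COARSE_IN_A_ROW 100

static unsigned int consecutive_coarse;

/*
 * Invoked once per watchdog check, indicating whether this check had
 * to fall back to the coarse-grained path.  Complains exactly once
 * when the threshold is reached; any fine-grained check resets it.
 */
static void note_watchdog_check(bool coarse)
{
	if (!coarse) {
		consecutive_coarse = 0;
		return;
	}
	if (++consecutive_coarse == MAX_COARSE_IN_A_ROW)
		fprintf(stderr,
			"clocksource watchdog: %u consecutive coarse-grained checks, fine-grained reads never succeeding?\n",
			consecutive_coarse);
}

int main(void)
{
	/* Simulate 150 coarse-grained checks in a row, then one fine-grained. */
	for (int i = 0; i < 150; i++)
		note_watchdog_check(true);
	note_watchdog_check(false);
	return 0;
}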

I am thinking in terms of a separate patch for this purpose.

Thoughts?

							Thanx, Paul
