Date: Thu, 11 Nov 2021 09:53:31 +0800
From: Feng Tang <feng.tang@...el.com>
To: Waiman Long <longman@...hat.com>
Cc: "Paul E. McKenney" <paulmck@...nel.org>,
	John Stultz <john.stultz@...aro.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Stephen Boyd <sboyd@...nel.org>,
	linux-kernel@...r.kernel.org,
	Peter Zijlstra <peterz@...radead.org>,
	Cassio Neri <cassio.neri@...il.com>,
	Linus Walleij <linus.walleij@...aro.org>,
	Frederic Weisbecker <frederic@...nel.org>
Subject: Re: [PATCH 0/2] clocksource: Avoid incorrect hpet fallback

On Wed, Nov 10, 2021 at 08:30:10PM -0500, Waiman Long wrote:
>
> On 11/10/21 20:23, Feng Tang wrote:
> > Hi Waiman, Paul,
> >
> > On Wed, Nov 10, 2021 at 05:17:30PM -0500, Waiman Long wrote:
> > > It was found that when an x86 system was being stressed by running
> > > various different benchmark suites, the clocksource watchdog might
> > > occasionally mark TSC as unstable and fall back to hpet which will
> > > have a significant impact on system performance.
> > We've seen similar cases while running 'netperf' and 'lockbus/ioport'
> > cases of 'stress-ng' tool.
> >
> > In those scenarios, the clocksource used by kernel is tsc, while
> > hpet is used as watchdog. And when the "screwing" happens, we found
> > mostly it's the hpet's 'fault', that when system is under extreme
> > pressure, the read of hpet could take a long time, and even 2
> > consecutive reads of hpet will have a big gap (up to 1ms+) in between.
> > So the screw we saw is actually caused by hpet instead of tsc, as
> > tsc read is a lightweight cpu operation.
> >
> > I tried the following patch to detect the screw of watchdog itself,
> > and avoid wrongly judging the tsc to be unstable. It does help in
> > our tests, please help to review.
> >
> > And one further idea is to also add 2 consecutive reads of current
> > clocksource, and compare its gap with watchdog's, and skip the check
> > if the watchdog's is bigger.
>
> That is what I found too. And I also did a 2nd watchdog read to compare the
> consecutive delay versus half the threshold and skip the test if it exceeds
> it. My patch is actually similar in concept to what your patch does.

Aha, yes, I missed that. I just got to office, and saw the discussion
around 0/2 patch and replied, without going through the patches,
sorry about that.

0day reported some cases about stress-ng testing, and we are still
testing different cases we've seen.

Thanks,
Feng

> Cheers,
> Longman
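
The check described in the quoted reply (read the watchdog a second time, compare the gap between the two reads against half the threshold, and skip the test when the gap is larger) can be sketched roughly as below. This is only an illustration of the idea in Python, not the code from either patch: `watchdog_check`, `CheckResult`, the reader callbacks and the nanosecond units are all hypothetical names and assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckResult:
    skipped: bool   # the watchdog read itself was too slow, so no verdict
    unstable: bool  # clocksource judged unstable (only meaningful if not skipped)


def watchdog_check(
    read_wd: Callable[[], int],   # hypothetical: returns watchdog (e.g. hpet) time in ns
    read_cs: Callable[[], int],   # hypothetical: returns clocksource (e.g. tsc) time in ns
    last_wd_ns: int,              # watchdog reading from the previous check
    last_cs_ns: int,              # clocksource reading from the previous check
    threshold_ns: int,            # assumed maximum tolerated skew
) -> CheckResult:
    # Two back-to-back watchdog reads: on a healthy system the gap is tiny.
    wd_first = read_wd()
    wd_second = read_wd()

    # If the watchdog alone already burns more than half the threshold,
    # any skew measured now says more about the watchdog than the clocksource.
    if wd_second - wd_first > threshold_ns // 2:
        return CheckResult(skipped=True, unstable=False)

    cs_now = read_cs()

    # Normal comparison: did both clocks advance by about the same amount?
    wd_delta = wd_second - last_wd_ns
    cs_delta = cs_now - last_cs_ns
    return CheckResult(skipped=False, unstable=abs(cs_delta - wd_delta) > threshold_ns)
```

With a 100 us threshold, two watchdog reads that land 1 ms apart would make the check return `skipped=True` instead of blaming the clocksource, which matches the behaviour both messages are after.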