linux-kernel - Re: [PATCH] clocksource: don't run watchdog forever

Message-ID: <20210304074316.GA43191@shbuild999.sh.intel.com>
Date:   Thu, 4 Mar 2021 15:43:16 +0800
From:   Feng Tang <feng.tang@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        John Stultz <john.stultz@...aro.org>,
        Stephen Boyd <sboyd@...nel.org>, linux-kernel@...r.kernel.org,
        Qais Yousef <qais.yousef@....com>, andi.kleen@...el.com
Subject: Re: [PATCH] clocksource: don't run watchdog forever

Hi Thomas,

On Wed, Mar 03, 2021 at 04:50:31PM +0100, Thomas Gleixner wrote:
> On Tue, Mar 02 2021 at 20:06, Feng Tang wrote:
> > On Tue, Mar 02, 2021 at 10:16:37AM +0100, Peter Zijlstra wrote:
> >> On Tue, Mar 02, 2021 at 10:54:24AM +0800, Feng Tang wrote:
> >> > clocksource watchdog runs every 500ms, which creates some OS noise.
> >> > As the clocksource wreckage (especially for those that has per-cpu
> >> > reading hook) usually happens shortly after CPU is brought up or
> >> > after system resumes from sleep state, so add a time limit for
> >> > clocksource watchdog to only run for a period of time, and make
> >> > sure it run at least twice for each CPU.
> >> > 
> >> > Regarding performance data, there is no improvement data with the
> >> > micro-benchmarks we have like hackbench/netperf/fio/will-it-scale
> >> > etc. But it obviously reduces periodic timer interrupts, and may
> >> > help in following cases:
> >> > * When some CPUs are isolated to only run scientific or high
> >> >   performance computing tasks on a NOHZ_FULL kernel, where there
> >> >   is almost no interrupts, this could make it more quiet
> >> > * On a cluster which runs a lot of systems in parallel with
> >> >   barriers there are always enough systems which run the watchdog
> >> >   and make everyone else wait
> >> > 
> >> > Signed-off-by: Feng Tang <feng.tang@...el.com>
> >> 
> >> Urgh.. so this hopes and prays that the TSC wrackage happens in the
> >> first 10 minutes after boot.
> 
> which is wishful thinking....
> 
> > Yes, the 10 minutes part is only based on our past experience and we
> > can make it longer. But if there was real case that the wrackage happened
> > long after CPU is brought up like days, then this patch won't help
> > much.
> 
> It really depends on the BIOS wreckage. On one of my machine it takes up
> to a day depending on the workload.

Thanks for sharing the info.

> Anything pre TSC_ADJUST wants the watchdog on. With TSC ADJUST available
> we can probably avoid it.
> 
> There is a caveat though. If the machine never goes idle then TSC adjust
> is not able to detect a potential wreckage. OTOH, most of the broken
> BIOSes tweak TSC only by a few cycles and that is usually detectable
> during boot. So we might be clever about it and schedule a check every
> hour when during the first 10 minutes a modification of TSC adjust is
> seen on any CPU.

I don't have much experience with tsc_adjust, so let me check my
understanding. Does the 'modification of TSC' here cover these 2 cases?
* The first read of the 'TSC_ADJUST' MSR on a just-booted CPU returns a
  non-zero value
* A following read of 'TSC_ADJUST' doesn't equal the 'tsc_adjust' value
  saved in per-cpu data.
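
To make the two cases concrete, here is a rough userspace sketch, not
kernel code. The struct, the enum and check_tsc_adjust() are all
hypothetical names made up for illustration, and the MSR value is passed
in as a plain integer instead of being read from hardware:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU bookkeeping; not the kernel's actual structure. */
struct tsc_adjust_state {
	int64_t saved;	/* last TSC_ADJUST value recorded for this CPU */
	bool valid;	/* false until the first read has been recorded */
};

enum adjust_event {
	ADJUST_OK = 0,
	ADJUST_NONZERO_AT_BOOT,	/* case 1: first read is non-zero */
	ADJUST_CHANGED_LATER,	/* case 2: later read differs from saved */
};

/*
 * Classify one TSC_ADJUST reading for a CPU. 'msr_val' stands in for the
 * value a real implementation would read from the MSR (assumption).
 */
enum adjust_event check_tsc_adjust(struct tsc_adjust_state *st,
				   int64_t msr_val)
{
	if (!st->valid) {
		/* First read after the CPU came up: record it. */
		st->saved = msr_val;
		st->valid = true;
		return msr_val != 0 ? ADJUST_NONZERO_AT_BOOT : ADJUST_OK;
	}

	if (msr_val != st->saved) {
		/* Something modified TSC_ADJUST behind our back. */
		st->saved = msr_val;
		return ADJUST_CHANGED_LATER;
	}

	return ADJUST_OK;
}
```

If this matches what you meant, either non-OK result seen during the
first 10 minutes would be the trigger for the hourly check.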

Also, does the patch ("x86/tsc: mark tsc reliable for qualified platforms")
need to wait until this caveat case is solved?

Thanks,
Feng




> 
> Where is this TSC_DISABLE_WRITE bit again?
> 
> Thanks,
> 
>         tglx
>