Message-ID: <alpine.DEB.2.21.1809190948470.1468@nanos.tec.linutronix.de>
Date:   Wed, 19 Sep 2018 09:53:47 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Waiman Long <longman@...hat.com>
cc:     John Stultz <john.stultz@...aro.org>, linux-kernel@...r.kernel.org,
        Stephen Boyd <sboyd@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH v2] clocksource: Warn if too many missing ticks are
 detected

On Tue, 18 Sep 2018, Waiman Long wrote:

> The clocksource watchdog, when running, is scheduled on all the CPUs in
> the system sequentially in a round-robin fashion with a period of 0.5s.
> A bug in the 4.18 kernel is causing missing ticks when nohz_full
> is specified. Under some circumstances, this causes the watchdog to
> incorrectly state that the TSC is unstable because of counter overflow
> in the hpet watchdog clock source after a few minutes delay.
> 
> That particular bug is fixed by the 4.19 commit 7059b36636beab ("sched:
> idle: Avoid retaining the tick when it has been stopped"). To make it
> easier to catch this kind of bug in the future, a check is added to see
> if there is too much delay in the invocation of the watchdog callback
> and print a warning once if it happens.

Second thoughts on this. The clocksource watchdog is the wrong place for
this check, as it only checks at a place where the symptom shows. What about
putting it right at the source, i.e. in the timer wheel, which does not
depend on the clocksource watchdog being active? The clocksource watchdog
triggering is just one of the symptoms; in general, timers being massively
late is not a good thing.
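
To illustrate the idea, here is a rough userspace model of a lateness check
done at timer expiry rather than in one particular timer callback. It is only
a sketch, not kernel code: LateTimerDetector, run_expired and max_late_ticks
are hypothetical names, the threshold is arbitrary, and the real timer wheel
looks nothing like a Python list. It only shows the shape of the check:
compare the time a timer actually runs against its expiry and warn once if
the gap is too large.

```python
# Userspace model only. All names here are hypothetical, not kernel APIs.
import logging


class LateTimerDetector:
    """Warn once when a timer runs far later than its programmed expiry."""

    def __init__(self, max_late_ticks):
        self.max_late_ticks = max_late_ticks  # assumed, arbitrary threshold
        self.warned = False

    def check(self, expires, now):
        """Return True only if this call emitted the one-time warning."""
        late = now - expires
        if late <= self.max_late_ticks or self.warned:
            return False
        self.warned = True
        logging.warning("timer expired %d ticks late (limit %d)",
                        late, self.max_late_ticks)
        return True


def run_expired(timers, now, detector):
    """Run every (expires, callback) pair that is due; return the rest.

    The lateness check sits in the generic expiry path, so it covers all
    timers, not just the clocksource watchdog.
    """
    pending = []
    for expires, callback in timers:
        if expires <= now:
            detector.check(expires, now)
            callback()
        else:
            pending.append((expires, callback))
    return pending
```

Because the check lives in the expiry path, any massively late timer trips
it, whether or not a clocksource watchdog is running.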

Thanks,

	tglx
